Re: [PATCH net v3] ibmveth: set correct gso_size and gso_type

2016-12-13 Thread Thomas Falcon
On 12/10/2016 02:56 PM, David Miller wrote:
> From: Thomas Falcon 
> Date: Sat, 10 Dec 2016 12:39:48 -0600
>
>> v3: include a check for non-zero mss when calculating gso_segs
>>
>> v2: calculate gso_segs after Eric Dumazet's comments on the earlier patch
>> and make sure everyone is included on CC
> I already applied v1 which made it all the way even to Linus's
> tree.  So you'll have to send me relative fixups if there are
> things to fix or change since v1.
>
> You must always generate patches against the current 'net' tree.
>
Sorry about that.  Thank you for applying it on short notice.  I agree that 
using the TCP checksum field is not ideal, but it was a compromise with the 
VIOS team.  Hopefully, there will be a better way in the future.

Thanks again,

Tom 



Re: [PATCH net v3] ibmveth: set correct gso_size and gso_type

2016-12-10 Thread David Miller
From: Thomas Falcon 
Date: Sat, 10 Dec 2016 12:39:48 -0600

> v3: include a check for non-zero mss when calculating gso_segs
> 
> v2: calculate gso_segs after Eric Dumazet's comments on the earlier patch
> and make sure everyone is included on CC

I already applied v1 which made it all the way even to Linus's
tree.  So you'll have to send me relative fixups if there are
things to fix or change since v1.

You must always generate patches against the current 'net' tree.


[PATCH net v3] ibmveth: set correct gso_size and gso_type

2016-12-10 Thread Thomas Falcon
This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:

"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.

We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even although the sockets receive buffer
was completely empty."
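
The arithmetic behind that symptom: the receiver advertises its window in
whole MSS-sized units, so an inflated MSS rounds any realistic amount of
free buffer space down to zero. A minimal standalone sketch of the effect,
assuming whole-MSS rounding (a simplification of the kernel's actual
__tcp_select_window() logic):

    /* toy model: advertise a window of whole MSS-sized segments */
    #include <stdio.h>

    static unsigned int advertised_window(unsigned int free_space,
                                          unsigned int mss)
    {
            return (free_space / mss) * mss;
    }

    int main(void)
    {
            /* sane MSS: most of the free space is advertised */
            printf("%u\n", advertised_window(45000, 1448));  /* 44888 */
            /* gso_size unset -> bogus ~64K MSS -> zero window */
            printf("%u\n", advertised_window(45000, 65535)); /* 0 */
            return 0;
    }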

We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.

Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.
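
As a reference for the buffer-borne case, a hypothetical helper showing how
the poll loop could consume that descriptor bit (the helper name, the
eight-byte offset, and the big-endian 64-bit width are illustrative
assumptions, not taken from this patch):

    /* sketch: recover the MSS from the rx buffer when firmware has
     * set the large-packet bit in the rx queue descriptor
     */
    static u16 ibmveth_rx_buffer_mss(struct ibmveth_adapter *adapter,
                                     struct sk_buff *skb)
    {
            u16 mss = 0;

            if (ibmveth_rxq_large_packet(adapter)) {
                    /* assumed layout: firmware writes the MSS as a
                     * big-endian 64-bit value eight bytes into the
                     * buffer, which is skb->data at this stage
                     */
                    __be64 *rxmss = (__be64 *)(skb->data + 8);

                    mss = (u16)be64_to_cpu(*rxmss);
            }

            return mss;
    }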

Reviewed-by: Brian King 
Reviewed-by: Pradeep Satyanarayana 
Reviewed-by: Marcelo Ricardo Leitner 
Reviewed-by: Jonathan Maxwell 
Reviewed-by: David Dai 
Signed-off-by: Thomas Falcon 
---
v3: include a check for non-zero mss when calculating gso_segs

v2: calculate gso_segs after Eric Dumazet's comments on the earlier patch
and make sure everyone is included on CC
---
 drivers/net/ethernet/ibm/ibmveth.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 drivers/net/ethernet/ibm/ibmveth.h |  1 +
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index ebe6071..6dc24a1 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -58,7 +58,7 @@
 
 static const char ibmveth_driver_name[] = "ibmveth";
 static const char ibmveth_driver_string[] = "IBM Power Virtual Ethernet Driver";
-#define ibmveth_driver_version "1.05"
+#define ibmveth_driver_version "1.06"
 
 MODULE_AUTHOR("Santiago Leon ");
 MODULE_DESCRIPTION("IBM Power Virtual Ethernet Driver");
@@ -137,6 +137,11 @@ static inline int ibmveth_rxq_frame_offset(struct ibmveth_adapter *adapter)
return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_OFF_MASK;
 }
 
+static inline int ibmveth_rxq_large_packet(struct ibmveth_adapter *adapter)
+{
+   return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_LRG_PKT;
+}
+
 static inline int ibmveth_rxq_frame_length(struct ibmveth_adapter *adapter)
 {
	return be32_to_cpu(adapter->rx_queue.queue_addr[adapter->rx_queue.index].length);
@@ -1174,6 +1179,52 @@ static netdev_tx_t ibmveth_start_xmit(struct sk_buff *skb,
goto retry_bounce;
 }
 
+static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
+{
+   struct tcphdr *tcph;
+   int offset = 0;
+   int hdr_len;
+
+   /* only TCP packets will be aggregated */
+   if (skb->protocol == htons(ETH_P_IP)) {
+   struct iphdr *iph = (struct iphdr *)skb->data;
+
+   if (iph->protocol == IPPROTO_TCP) {
+   offset = iph->ihl * 4;
+   skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+   } else {
+   return;
+   }
+   } else if (skb->protocol == htons(ETH_P_IPV6)) {
+   struct ipv6hdr *iph6 = (struct ipv6hdr *)skb->data;
+
+   if (iph6->nexthdr == IPPROTO_TCP) {
+   offset = sizeof(struct ipv6hdr);
+   skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+   } else {
+   return;
+   }
+   } else {
+   return;
+   }
+   /* if mss is not set through Large Packet bit/mss in rx buffer,
+* expect that the mss will be written to the tcp header checksum.
+*/
+   tcph = (struct tcphdr *)(skb->data + offset);
+   hdr_len = offset + tcph->doff * 4;
+   if (lrg_pkt) {
+   skb_shinfo(skb)->gso_size = mss;
+   } else if (offset) {
+   skb_shinfo(skb)->gso_size = ntohs(tcph->check);
+   tcph->check = 0;
+   }
+
+   if (skb_shinfo(skb)->gso_size)
+   skb_shinfo(skb)->gso_segs =
+   DIV_ROUND_UP(skb->len - hdr_len,
+skb_shinfo(skb)->gso_size);
+}
+
 static int ibmveth_poll(struct napi_struct *napi, int budget)
 {
struct ibmveth_adapter *adapter =
@@ -1182,6 +1233,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
int frames_processed = 0;
unsigned long lpar_rc;
struct iphdr *iph;
+   u16 mss = 0;
 
 restart_poll:
while (frames_processed < budget) {
@@ -1199,9 +1251,21 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
int length = ibmveth_rxq_frame_length(adapter);
int offset = ibmveth_rxq_frame_offset(ad