Hi,

Regarding the issue below.

I ran more tests and found that setting the CHECKSUM_PARTIAL flag corrupts
the checksum in non mangled NAT packets when hitting the
queue_userspace_packet() function (as reported below).
When I tried to reset the flag after the helper function call, non mangled
packets had the correct checksum but mangled FTP packets were still
incorrect.
Instead of setting the CHECKSUM_PARTIAL to avoid hitting the skb_dst on pre
kernel 4.5, I have attached a patch that gives the skb the required skb_dst
and the flags that are checked in the kernel checksum calculation.
This essentially causes the pre kernel 4.5 version to hit the same code as
post 4.5 and recalculate the layer 4 checksums in the Netfilter modules.
I have tested this and the checksums are now correct for both mangled and
non mangled NATed packet that go through the FTP helper.

Is it ok to submit this patch here? If this approach seems valid I can
prepare it for the dev email list.

Thanks,
John



---------- Forwarded message ----------
From: John Hurley <[email protected]>
Date: Tue, Dec 20, 2016 at 5:22 PM
Subject: TCP Checksum issue on OVS 2.6.1
To: [email protected]


Hi,

I am playing about with NAT in OVS 2.6.1 and have come across an issue with
TCP checksums when helpers are being used.

I have a setup of client -> OVS -> ftp server (all physical ports with no
checksum offloading enabled the network cards). OVS is running kernel
version 3.13.

I am using rules such as these for testing purposes:

 cookie=0x0, duration=121.467s, table=0, n_packets=0, n_bytes=0,
idle_age=121, ct_state=-trk,ip,in_port=4 actions=ct(commit,table=0,nat)
 cookie=0x0, duration=121.453s, table=0, n_packets=0, n_bytes=0,
idle_age=121, ct_state=-trk,ip,in_port=3 actions=ct(commit,table=0,nat(
src=10.0.0.5))
 cookie=0x0, duration=121.462s, table=0, n_packets=0, n_bytes=0,
idle_age=121, ct_state=+rel+trk,in_port=4 actions=output:3
 cookie=0x0, duration=121.458s, table=0, n_packets=0, n_bytes=0,
idle_age=121, ct_state=-rel+trk,in_port=4 actions=output:3
 cookie=0x0, duration=121.449s, table=0, n_packets=0, n_bytes=0,
idle_age=121, ct_state=+rel+trk,in_port=3 actions=output:4
 cookie=0x0, duration=121.444s, table=0, n_packets=0, n_bytes=0,
idle_age=121, ct_state=-rel+trk,in_port=3 actions=output:4
 cookie=0x0, duration=121.440s, table=0, n_packets=0, n_bytes=0,
idle_age=121, priority=0,arp,in_port=3 actions=output:4
 cookie=0x0, duration=121.435s, table=0, n_packets=0, n_bytes=0,
idle_age=121, priority=0,arp,in_port=4 actions=output:3


This works fine and I can establish FTP connections between client and
server. However, when I introduce helper modules, I can no longer connect.
I've narrowed this down to bad TCP checksums.

Tracing through the code, the marking of the skb checksum data in the
he ovs_ct_helper (datapath/conntrack.c) seems to be the cause (only happens
in kernels < 4.6). When the packet is recirculated and upcalled to
user-space, the following code ends up corrupting the TCP checksum:

in datapath.c/queue_userspace_packet()

/* Complete checksum if needed */
if (skb->ip_summed == CHECKSUM_PARTIAL &&
   (err = skb_checksum_help(skb)))

I have verified the checksums before and after this call and it goes from
correct (taking into account the address translation) to invalid.

In this case, it is the SYN packet of the FTP connection that has the bad
checksum so there should be no payload changes involved from the helper.
The checksum is updated correctly from the Netfilter NAT code so should not
need CHECKSUM_PARTIAL set here? This will, obviously, not cover cases where
the packet is modified by the helper.

I will try to dig a bit more into this and report any further information.

Thanks,
John
diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index d942884..18db41b 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -314,6 +314,10 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
 	u8 nexthdr;
 	int err;
 
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
+    struct rtable *rt = NULL;
+#endif
+
 	ct = nf_ct_get(skb, &ctinfo);
 	if (!ct || ctinfo == IP_CT_RELATED_REPLY)
 		return NF_ACCEPT;
@@ -352,43 +356,29 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
 #if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
 	/* Linux 4.5 and older depend on skb_dst being set when recalculating
 	 * checksums after NAT helper has mangled TCP or UDP packet payload.
-	 * This dependency is avoided when skb is CHECKSUM_PARTIAL or when UDP
-	 * has no checksum.
 	 *
-	 * The dependency is not triggered when the main NAT code updates
-	 * checksums after translating the IP header (address, port), so this
-	 * fix only needs to be executed on packets that are both being NATted
-	 * and that have a helper assigned.
+     * skb_dst is cast to a rtable struct and the flags examined.
+     * Forcing these flags to have RTCF_LOCAL set allows checksum calculations
+     * to be carried out in the same way as kernel versions > 4.5
 	 */
 	if (ct->status & IPS_NAT_MASK && skb->ip_summed != CHECKSUM_PARTIAL) {
-		u8 ipproto = (proto == NFPROTO_IPV4)
-			? ip_hdr(skb)->protocol : nexthdr;
-		u16 offset = 0;
-
-		switch (ipproto) {
-		case IPPROTO_TCP:
-			offset = offsetof(struct tcphdr, check);
-			break;
-		case IPPROTO_UDP:
-			/* Skip if no csum. */
-			if (udp_hdr(skb)->check)
-				offset = offsetof(struct udphdr, check);
-			break;
-		}
-		if (offset) {
-			if (unlikely(!pskb_may_pull(skb, protoff + offset + 2)))
-				return NF_DROP;
-
-			skb->csum_start = skb_headroom(skb) + protoff;
-			skb->csum_offset = offset;
-			skb->ip_summed = CHECKSUM_PARTIAL;
-		}
+        rt = kmalloc(sizeof(struct rtable), GFP_KERNEL);
+        rt->rt_flags = RTCF_LOCAL;
+        skb_dst_set(skb, (struct dst_entry *)rt);
 	}
 #endif
+
 	err = helper->help(skb, protoff, ct, ctinfo);
 	if (err != NF_ACCEPT)
 		return err;
 
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
+    if (rt) {
+        skb_dst_set(skb, NULL);
+        kfree(rt);
+    }
+#endif
+
 	/* Adjust seqs after helper.  This is needed due to some helpers (e.g.,
 	 * FTP with NAT) adusting the TCP payload size when mangling IP
 	 * addresses and/or port numbers in the text-based control connection.
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

Reply via email to