Hi Jaime,

Thank you so much for trying this. Jiri Benc is the expert on this, so I am cc'ing him.
Jiri, do we need to calculate the checksum before push_nsh if checksum offload is on?

-----Original Message-----
From: ovs-discuss-boun...@openvswitch.org [mailto:ovs-discuss-boun...@openvswitch.org] On Behalf Of Jaime Caamaño Ruiz
Sent: Monday, June 25, 2018 5:45 PM
To: jcaam...@suse.com; ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Bad checksums observed with nsh encapsulation

Hello

I looked a bit more into the issue. This happens when OVS receives a packet marked CHECKSUM_PARTIAL. In a normal vm2vm, non-nsh scenario, OVS passes the same CHECKSUM_PARTIAL on to the receiver, which then won't verify the checksum. But when we push nsh headers, the first receiver may not be the final receiver, and CHECKSUM_PARTIAL may not reach the final receiver, which will then verify the checksum and reject it as bad.

So I think it may be necessary to handle the CHECKSUM_PARTIAL case on nsh_push, something like adding

    if (skb->ip_summed == CHECKSUM_PARTIAL) {
        skb_checksum_help(skb);
    }

I tried that and it got rid of my problem. Any thoughts?

BR
Jaime.

-----Original Message-----
From: Jaime Caamaño Ruiz <jcaam...@suse.de>
Reply-To: jcaam...@suse.com
To: jcaam...@suse.com, ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Bad checksums observed with nsh encapsulation
Date: Thu, 14 Jun 2018 18:15:10 +0200

Hello

I have done a follow-up test very similar to the previous one, but this time using two computes, such that the client and server reside on one of them and the vnf on the other. This means that packets coming from either client or server that are nsh encapsulated are then forwarded to the vnf compute, egressing through a vxlan tunnel port (vxlan+eth+nsh+payload).

In this scenario I don't observe the checksum problem. So it is the combination of nsh encapsulation + tap port egress where the checksum is sometimes observed to be incorrect.

BR
Jaime.
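The CHECKSUM_PARTIAL behaviour discussed in the 25 June message can be modelled in userspace. What follows is only a sketch, not kernel code: every helper name here (csum_add, set_partial, complete_checksum, checksum_ok, demo) is hypothetical, and the segment is a made-up FIN/ACK to port 80. It assumes the usual offload convention that the sender leaves only the pseudo-header sum in the TCP checksum field, and that software completion (which is roughly what skb_checksum_help() is expected to do) sums the segment and stores the complement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Add len bytes to a running one's-complement sum of big-endian 16-bit words. */
static uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;   /* pad the odd trailing byte */
    return sum;
}

/* Fold carries back into 16 bits; no final complement. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* IPv4 pseudo-header sum: source, destination, protocol 6 (TCP), TCP length. */
static uint16_t pseudo_sum(const uint8_t src[4], const uint8_t dst[4], uint16_t tcp_len)
{
    uint32_t sum = csum_add(csum_add(0, src, 4), dst, 4);
    return csum_fold(sum + 6 + tcp_len);
}

/* Model of the offloaded state: only the pseudo-header sum is stored
 * in the checksum field (bytes 16-17 of the TCP header). */
static void set_partial(uint8_t *tcp, size_t len, const uint8_t src[4], const uint8_t dst[4])
{
    uint16_t s = pseudo_sum(src, dst, (uint16_t)len);
    tcp[16] = (uint8_t)(s >> 8);
    tcp[17] = (uint8_t)(s & 0xff);
}

/* Software completion: sum the segment, seeded field included, and
 * store the complement. Assumed to mirror what skb_checksum_help() does. */
static void complete_checksum(uint8_t *tcp, size_t len)
{
    uint16_t c = (uint16_t)~csum_fold(csum_add(0, tcp, len));
    tcp[16] = (uint8_t)(c >> 8);
    tcp[17] = (uint8_t)(c & 0xff);
}

/* Receiver-side check: pseudo-header plus segment must fold to 0xffff. */
static int checksum_ok(const uint8_t *tcp, size_t len, const uint8_t src[4], const uint8_t dst[4])
{
    uint32_t sum = pseudo_sum(src, dst, (uint16_t)len);
    return csum_fold(csum_add(sum, tcp, len)) == 0xffff;
}

/* Hypothetical FIN/ACK from 10.0.0.1:12345 to 10.0.0.2:80, no payload. */
void demo(void)
{
    const uint8_t src[4] = {10, 0, 0, 1}, dst[4] = {10, 0, 0, 2};
    uint8_t tcp[20] = {0x30, 0x39, 0x00, 0x50, 0, 0, 0, 1, 0, 0, 0, 0,
                       0x50, 0x11, 0x72, 0x10, 0, 0, 0, 0};

    set_partial(tcp, sizeof tcp, src, dst);
    assert(!checksum_ok(tcp, sizeof tcp, src, dst)); /* a verifying receiver rejects it */
    complete_checksum(tcp, sizeof tcp);
    assert(checksum_ok(tcp, sizeof tcp, src, dst));  /* passes after completion */
    printf("checksum field after completion: 0x%02x%02x\n", tcp[16], tcp[17]);
}
```

Calling demo() from a small main() exercises both states. The point of the model is only that a segment left in the partial state fails verification at any receiver that actually checks it, which matches the symptom reported when the nsh-encapsulated packet reaches a hop that does not inherit the offload metadata.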
-----Original Message-----
From: Jaime Caamaño Ruiz <jcaam...@suse.de>
Reply-To: jcaam...@suse.com
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: [ovs-discuss] Bad checksums observed with nsh encapsulation
Date: Wed, 13 Jun 2018 12:51:59 +0200

Hello

I am facing a problem where eth+nsh encapsulated packets egress OVS with an incorrect checksum. The scenario is

    client ---- vnf ---- server

with all guests on the same host, so this is vm2vm traffic; tap ports are directly added to the ovs bridge. TCP traffic from/to server port 80 is encapsulated with eth+nsh and traverses the vnf. I exercise the traffic by using nc on both client and server.

I include captures at the client [1] and at the vnf [2] where I attempt three tcp connections on port 80. The general observation is that packets generated on client/server are seen there with wrong checksums due to offloading but then arrive at the vnf with a correct checksum. But not all of them. For the first connection attempt you can see that SYN (frame 74) and ACK (78) are ok, but then FIN (79) is not ok. A retransmitted FIN (80) is still not ok, and then a further FIN (93) retransmission is ok. Much the same happens for the second attempt. The third attempt shows a bad SYN (104) coming from the server.

Two additional observations:
- This does not happen if I try the same on a port other than 80, so that the traffic goes directly from the client to the server with no eth+nsh encapsulation.
- This does not happen if I disable tx offloading on both the server and the client.

I also include the flows [3] and the ofproto trace [4] for the FIN (79), generated by the client, which is eth+nsh encapsulated and forwarded to the vnf. The decision on whether a packet should be eth+nsh encapsulated or not happens on table 101 by setting reg2, which is then checked on table 221. The packet is nsh encapsulated on table 222 and then ethernet encapsulated on table 83.
If not encapsulated, the packet would go from 221 back to 220 and be output there without any further actions.

Using OVS 2.9.2 with the OVS tree kernel module. Kernel is 4.4.

Am I understanding the problem correctly, in that OVS is responsible for these checksums when offloading is enabled? Any pointers on how I can debug this further? Why would just some of the eth+nsh packets exhibit this problem and not all? Why would these bad packets be ok after retransmissions?

[1] https://filebin.net/8mnypc2qm4vninof/client.pcap?t=b097kh0m
[2] https://filebin.net/8mnypc2qm4vninof/vnf_eth0.pcap?t=b097kh0m
[3] https://hastebin.com/nuhexufaze.sql
[4] https://hastebin.com/yevufanula.http

Thanks for your help,
Jaime.
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss