Hi Greg, Ben,

I doubt we would see a measurable difference in performance from the additional conditional jump based on the packet flags.

That does bring up an interesting question: shouldn't all fragments of a packet hash to the same single flow, and shouldn't the resulting multipath hash value be cached (for at least 5 seconds or so)? Based on our observations, the hash is calculated for each individual fragment, which is sub-optimal. We would still need to exclude ports for the first fragment, in case subsequent fragments arrive after the flow entry has disappeared - but in theory, the hash could be computed once, for the first packet in each flow (if there is space in the flow-cache entry).
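The caching idea above could be sketched roughly as follows. All names here are hypothetical and the structure is deliberately tiny; this is not OVS's real flow cache, just an illustration of keying a cached multipath hash on the fields that all fragments of a datagram share (addresses plus the IP Identification field) with a short TTL:

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical per-fragment-group hash cache (illustrative only). */
#define HASH_CACHE_TTL_S 5
#define CACHE_SLOTS 256

struct frag_hash_entry {
    uint32_t src, dst;   /* IPv4 addresses, shared by all fragments */
    uint16_t ip_id;      /* IP Identification field, shared by all fragments */
    uint32_t hash;       /* multipath hash computed from the first fragment */
    time_t   expires;    /* entry is valid while now < expires */
};

static struct frag_hash_entry cache[CACHE_SLOTS];

static struct frag_hash_entry *
cache_slot(uint32_t src, uint32_t dst, uint16_t ip_id)
{
    return &cache[(src ^ dst ^ ip_id) % CACHE_SLOTS];
}

/* Remember the hash computed for the first fragment. */
void
frag_hash_store(uint32_t src, uint32_t dst, uint16_t ip_id,
                uint32_t hash, time_t now)
{
    struct frag_hash_entry *e = cache_slot(src, dst, ip_id);
    e->src = src;
    e->dst = dst;
    e->ip_id = ip_id;
    e->hash = hash;
    e->expires = now + HASH_CACHE_TTL_S;
}

/* Return nonzero and fill *hash if a live entry matches this fragment. */
int
frag_hash_lookup(uint32_t src, uint32_t dst, uint16_t ip_id,
                 time_t now, uint32_t *hash)
{
    struct frag_hash_entry *e = cache_slot(src, dst, ip_id);
    if (now < e->expires && e->src == src && e->dst == dst
        && e->ip_id == ip_id) {
        *hash = e->hash;
        return 1;
    }
    return 0;
}
```

Later fragments would then reuse the cached value instead of rehashing, and a miss (entry expired or evicted) would fall back to the ports-excluded hash, which is why the exclusion is still needed for the first fragment.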
In our case, it's not only that packets could get reordered due to taking different paths - the ECMP destinations are end systems (like an anycast IP), and reassembly fails because the first packet is sent to one host and the rest of the fragments to another host. Ben - you are correct that it also applies to TCP and SCTP in theory; you just won't typically see fragments for those protocols.

I'll prepare a formal patch to fix it for all protocols.

Regards,
Jeroen

-----Original Message-----
From: Ben Pfaff <[email protected]>
Sent: Wednesday, June 5, 2019 3:18 PM
To: Gregory Rose <[email protected]>
Cc: Van Bemmel, Jeroen (Nokia - US) <[email protected]>; [email protected]
Subject: Re: [ovs-dev] Fix for hashing fragmented UDP packets

On Fri, May 31, 2019 at 01:18:53PM -0700, Gregory Rose wrote:
>
> On 5/30/2019 10:37 PM, Van Bemmel, Jeroen (Nokia - US) wrote:
> > Hello all,
> >
> > Back in 2015 I submitted some code to improve ECMP hashing
> > algorithms used in OVS, see
> > https://mail.openvswitch.org/pipermail/ovs-dev/2015-July/300748.html
> >
> > We have now found an issue with the hashing of fragmented UDP
> > packets: the first packet contains the UDP ports, but the rest of
> > the fragments do not - so subsequent packets may hash to a different
> > destination.
> > This causes IMS SIP calls using UDP to fail (for example).
> >
> > To fix this, we need a 1-line patch at
> > https://github.com/openvswitch/ovs/blob/master/lib/flow.c#L2374:
> >
> > Old: (inc_udp_ports && flow->nw_proto == IPPROTO_UDP)) {
> > New: (inc_udp_ports && flow->nw_proto == IPPROTO_UDP &&
> >      !(flow->nw_frag & FLOW_NW_FRAG_MASK)) {
> >
> > And something similar at
> > https://github.com/openvswitch/ovs/blob/master/lib/flow.c#L2489:
> >
> > Old: if (is_ip_any(flow) && flow->nw_proto == IPPROTO_UDP) {
> > New: if (is_ip_any(flow) && flow->nw_proto == IPPROTO_UDP &&
> >      !(flow->nw_frag & FLOW_NW_FRAG_MASK)) {
> >
> > In other words: don't include the ports for the first fragment.
> >
> > Comments or objections?
>
> I guess if there were a lot of UDP flows with fragmented traffic then
> it would slow down the lookups.

I don't understand that statement. Why would changing the ECMP hash value slow down lookups? (What lookups?)

Oh, are you thinking of this as a hash function for use in hash tables? I do not think that this particular hash function is used that way.

> I don't believe that there is a lot of fragmented UDP traffic in most
> scenarios, although I'm more familiar with datacenter than telco. It
> seems like a reasonable approach, but if we could get some sort of
> regression testing done to show how much of an impact it might have in
> scenarios with high numbers of fragmented UDP connections, then I
> think that might make us more comfortable with the change.

It would have the effect of putting all the fragmented UDP traffic into a single ECMP hash value. If all the traffic were fragmented UDP, that would waste a lot of capacity, but the traffic would not be reordered.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
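[Editor's note] The effect of the proposed check can be illustrated with a toy model. The FLOW_NW_FRAG_* flag names mirror OVS's, but the struct and the hash below are simplified stand-ins, not the real struct flow or the hash functions in lib/flow.c:

```c
#include <stdint.h>

/* Simplified stand-ins; flag semantics follow OVS's FLOW_NW_FRAG_* bits. */
#define FLOW_NW_FRAG_ANY   (1 << 0)   /* packet is a fragment */
#define FLOW_NW_FRAG_LATER (1 << 1)   /* not the first fragment */
#define FLOW_NW_FRAG_MASK  (FLOW_NW_FRAG_ANY | FLOW_NW_FRAG_LATER)

#define PROTO_UDP 17

struct mini_flow {
    uint32_t nw_src, nw_dst;  /* IPv4 addresses */
    uint16_t tp_src, tp_dst;  /* UDP ports; absent in later fragments */
    uint8_t  nw_proto;
    uint8_t  nw_frag;         /* FLOW_NW_FRAG_* bits */
};

static uint32_t
mix(uint32_t h, uint32_t v)
{
    h ^= v;
    return h * 16777619u;     /* FNV-1a style step */
}

/* ECMP hash with the proposed fragment check applied. */
uint32_t
ecmp_hash(const struct mini_flow *flow)
{
    uint32_t h = 2166136261u;
    h = mix(h, flow->nw_src);
    h = mix(h, flow->nw_dst);
    h = mix(h, flow->nw_proto);
    /* Include L4 ports only for unfragmented packets, so the first
     * fragment (which carries ports) and later fragments (which do
     * not) hash identically and take the same ECMP path. */
    if (flow->nw_proto == PROTO_UDP
        && !(flow->nw_frag & FLOW_NW_FRAG_MASK)) {
        h = mix(h, ((uint32_t) flow->tp_src << 16) | flow->tp_dst);
    }
    return h;
}
```

With this check, a first fragment (ports present, FLOW_NW_FRAG_ANY set) and a later fragment (ports zero, FLOW_NW_FRAG_ANY | FLOW_NW_FRAG_LATER) of the same datagram produce the same hash; without it, they would generally land on different ECMP next hops, which is the reassembly failure described above.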
