Ben,

Looking into the current OVS behavior w.r.t. IP fragments: based on the
key_extract function in datapath/flow.c, it looks like OVS treats a
multi-fragment UDP packet as two different flows:
1) Unfragmented packets + first fragments, for a given 5-tuple
2) Subsequent fragments (for the src/dst IP + protocol 3-tuple)

That means the multipath hash is calculated twice for such a packet: once
for each flow, not once for each fragment.
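
To make the two flows concrete, here is a minimal sketch of the behavior
described above. It is not the actual key_extract code: struct sketch_key,
struct sketch_pkt and both functions are hypothetical names, and the real
flow key carries many more fields. It only assumes that subsequent fragments
(non-zero fragment offset) have no L4 header, so their ports are left at
zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a flow key (NOT struct sw_flow_key). */
struct sketch_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;   /* 0 when no L4 header is available */
    uint16_t dst_port;   /* 0 when no L4 header is available */
    uint8_t  proto;
};

/* Hypothetical summary of the header fields of one IPv4 packet. */
struct sketch_pkt {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t frag_offset;    /* 0 for unfragmented packets and first fragments */
    bool     more_fragments; /* MF bit; not consulted in this model */
    uint8_t  proto;
};

/* Build a key: later fragments keep only the 3-tuple, everything else
 * (unfragmented packets and first fragments) keeps the full 5-tuple. */
static void sketch_key_extract(const struct sketch_pkt *pkt,
                               struct sketch_key *key)
{
    assert(pkt != NULL && key != NULL);

    memset(key, 0, sizeof *key);
    key->src_ip = pkt->src_ip;
    key->dst_ip = pkt->dst_ip;
    key->proto = pkt->proto;

    if (pkt->frag_offset != 0) {
        return;              /* later fragment: ports stay zero */
    }
    key->src_port = pkt->src_port;
    key->dst_port = pkt->dst_port;
}

/* Field-wise comparison, so struct padding never matters. */
static bool sketch_key_equal(const struct sketch_key *a,
                             const struct sketch_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip
        && a->src_port == b->src_port && a->dst_port == b->dst_port
        && a->proto == b->proto;
}
```

In this model a first fragment produces the same key as an unfragmented
packet of the same 5-tuple, while every later fragment produces a second,
port-less key.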

For consistency, we could consider changing this key_extract logic to group
all fragments (first + subsequent) together, separate from unfragmented
packets.
That way, the first fragment would get the same treatment as subsequent
fragments (e.g. logging, hashing, mirroring, DSCP rewriting, whatever people
do with a flow). A first fragment is very likely to be followed soon by the
next fragment of that same flow, so in terms of memory access the flow cache
would benefit from hashing these packets to the same slot.

A potential downside or caveat is that logic depending on flow port
matching (e.g. ACLs) would no longer match on first fragments, so fragment
processing further downstream would have to be evaluated.
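
A rough sketch of the alternative grouping and of the caveat, again under
stated assumptions: all names (struct grouped_key, struct grouped_pkt and the
functions) are hypothetical, the is_fragment field is one possible way to
keep fragments apart from unfragmented packets, and the port check merely
stands in for an ACL-style rule. None of this is existing OVS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flow key for the proposed grouping. */
struct grouped_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;    /* always 0 for fragments */
    uint16_t dst_port;    /* always 0 for fragments */
    uint8_t  proto;
    bool     is_fragment; /* assumption: separates fragments from the rest */
};

/* Hypothetical summary of the header fields of one IPv4 packet. */
struct grouped_pkt {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t frag_offset;    /* non-zero for subsequent fragments */
    bool     more_fragments; /* MF bit, set on all but the last fragment */
    uint8_t  proto;
};

/* A packet is a fragment if it is not at offset 0 or more fragments follow. */
static bool grouped_pkt_is_fragment(const struct grouped_pkt *pkt)
{
    return pkt->frag_offset != 0 || pkt->more_fragments;
}

/* Build a key: every fragment, including the first one, keeps only the
 * 3-tuple; unfragmented packets keep the full 5-tuple. */
static void grouped_key_extract(const struct grouped_pkt *pkt,
                                struct grouped_key *key)
{
    assert(pkt != NULL && key != NULL);

    memset(key, 0, sizeof *key);
    key->src_ip = pkt->src_ip;
    key->dst_ip = pkt->dst_ip;
    key->proto = pkt->proto;
    key->is_fragment = grouped_pkt_is_fragment(pkt);

    if (key->is_fragment) {
        return;              /* ports deliberately ignored for all fragments */
    }
    key->src_port = pkt->src_port;
    key->dst_port = pkt->dst_port;
}

/* Field-wise comparison, so struct padding never matters. */
static bool grouped_key_equal(const struct grouped_key *a,
                              const struct grouped_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip
        && a->src_port == b->src_port && a->dst_port == b->dst_port
        && a->proto == b->proto && a->is_fragment == b->is_fragment;
}

/* Stand-in for an ACL-style rule that matches on destination port only. */
static bool grouped_key_matches_dst_port(const struct grouped_key *key,
                                         uint16_t port)
{
    return key->dst_port == port;
}
```

With this grouping the first and later fragments share one key, so they
would hash identically, but a rule on destination port 53 that matches an
unfragmented packet no longer matches the first fragment of the same
5-tuple.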

I would say it's a potentially disruptive change with unclear benefits
(although it would fix the fragment hashing issue, in a different way).

Regards,
Jeroen

-----Original Message-----
From: Ben Pfaff <[email protected]> 
Sent: Wednesday, June 5, 2019 3:54 PM
To: Van Bemmel, Jeroen (Nokia - US) <[email protected]>
Cc: Gregory Rose <[email protected]>; [email protected]
Subject: Re: [ovs-dev] Fix for hashing fragmented UDP packets

On Wed, Jun 05, 2019 at 08:34:50PM +0000, Van Bemmel, Jeroen (Nokia - US) wrote:
> Hi Greg, Ben,
> 
> I doubt we would see a measurable difference in performance, with the 
> additional conditional jump based on the packet flags. That does bring 
> up an interesting question: Shouldn't fragmented packets all hash to 
> the same single flow, and shouldn't the resulting multipath hash value 
> get cached ( for at least 5 secs or so )? Based on our observations it 
> looks like the hash is calculated for each individual fragment, which 
> would be sub-optimal.

Hmm.  That *is* suboptimal.  If you figure out anything about why it is not 
doing better, then please do follow up on it.

> We would still need to exclude ports for the first fragment, in case 
> some subsequent fragments arrive after the flow entry disappeared - 
> but in theory, the hash could be done once, for the first packet in 
> each flow ( if there is space in the flow cache entry )

Yes.

> In our case, it's not only that packets could get reordered due to 
> taking different paths - the ECMP destinations are end systems ( like 
> an anycast IP ) and reassembly fails because the first packet is sent 
> to one host, and the rest of the fragments to another host.
> 
> Ben - you are correct that it also applies to TCP and SCTP in theory, 
> just that you won't typically see fragments for those protocols. I'll 
> prepare a formal patch to fix it for all protocols

Great.  Thank you.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
