Issue:
The outer UDP source port of a VXLAN packet is derived from the RSS
hash in the packet metadata. For packets coming from a VM, the hash
covers the 5-tuple if it is available, otherwise just the IP addresses.
If the VM fragments a large IP packet and sends the fragments to OVS,
only the first fragment contains the L4 header. Therefore, the first
fragment and the subsequent fragments get different UDP source ports in
the outgoing VXLAN header. This can lead to fragment reordering in the
fabric, as the fragments may take different paths.

Fix:
With this patch, the L4 header is ignored during hash calculation for
fragmented packets, so that all fragments of a packet, including the
first, hash to the same value.

Signed-off-by: Parvathy Tarur Ramachandran <[email protected]>
---
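Note for reviewers: the sketch below illustrates the behaviour described
in the commit message. It is not the OVS code. The struct, the mix()
step and sketch_hash() are hypothetical stand-ins for the miniflow
fields and for hash_add()/hash_finish(), and it assumes that later
fragments carry zeroed L4 ports because their L4 header is absent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified flow key; not OVS's struct flow or miniflow. */
struct pkt_key {
    uint32_t ip_src;
    uint32_t ip_dst;
    uint8_t ip_proto;
    uint8_t ip_frag;    /* Nonzero for any fragment, first or later. */
    uint16_t l4_src;    /* Assumed zero when the L4 header is absent. */
    uint16_t l4_dst;
};

/* Stand-in mixing step; OVS uses hash_add() and hash_finish() instead. */
static uint32_t
mix(uint32_t hash, uint32_t data)
{
    hash ^= data;
    hash *= 0x9e3779b1u;
    return hash ^ (hash >> 15);
}

/* Hashes the IP addresses and protocol, then the L4 ports.  When
 * 'skip_l4_on_frag' is true, the ports are left out for every fragment,
 * which mirrors the idea of this patch. */
static uint32_t
sketch_hash(const struct pkt_key *key, uint32_t basis, bool skip_l4_on_frag)
{
    uint32_t hash = basis;

    hash = mix(hash, key->ip_src);
    hash = mix(hash, key->ip_dst);
    hash = mix(hash, key->ip_proto);
    if (skip_l4_on_frag && key->ip_frag) {
        return hash;
    }
    return mix(hash, ((uint32_t) key->l4_src << 16) | key->l4_dst);
}
```

With skip_l4_on_frag set to false, a first fragment with real ports and
a later fragment with zeroed ports hash differently. With it set to
true, both hash to the same value, so a source port derived from the
hash would be the same for both. Unfragmented packets are unaffected.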
 lib/flow.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/flow.c b/lib/flow.c
index cc1b3f2..38bf377 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -2178,7 +2178,7 @@ miniflow_hash_5tuple(const struct miniflow *flow, uint32_t basis)
 
     if (flow) {
         ovs_be16 dl_type = MINIFLOW_GET_BE16(flow, dl_type);
-        uint8_t nw_proto;
+        uint8_t nw_proto, nw_frag;
 
         if (dl_type == htons(ETH_TYPE_IPV6)) {
             struct flowmap map = FLOWMAP_EMPTY_INITIALIZER;
@@ -2200,6 +2200,14 @@ miniflow_hash_5tuple(const struct miniflow *flow, uint32_t basis)
 
         nw_proto = MINIFLOW_GET_U8(flow, nw_proto);
         hash = hash_add(hash, nw_proto);
+        /* Skip the L4 header fields if the IP packet is fragmented, since
+         * only the first fragment carries the L4 header.
+         */
+        nw_frag = MINIFLOW_GET_U8(flow, nw_frag);
+        if (nw_frag) {
+            goto out;
+        }
+
         if (nw_proto != IPPROTO_TCP && nw_proto != IPPROTO_UDP
             && nw_proto != IPPROTO_SCTP && nw_proto != IPPROTO_ICMP
             && nw_proto != IPPROTO_ICMPV6) {
-- 
2.7.4
