Issue: The UDP source port of the outgoing VXLAN header is derived from the RSS hash in the packet metadata. For packets coming from a VM, the hash covers the 5-tuple if it is available, and otherwise only the IP addresses. If the VM fragments a large IP packet and sends the fragments to OVS, only the first fragment contains the L4 header. The first fragment and the subsequent fragments therefore get different UDP source ports in the outgoing VXLAN header. This can lead to fragment reordering in the fabric, because the fragments may take different paths.
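The mismatch can be illustrated with a minimal standalone sketch. Everything in it is hypothetical and simplified for illustration: `struct toy_pkt`, `toy_mix`, `toy_hash`, `toy_udp_src_port`, and the port range are not OVS code, and the FNV-1a-style mixing only stands in for the real hash. The `skip_l4_for_fragments` flag models hashing fragments on L3 fields only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified packet metadata; not an OVS structure. */
struct toy_pkt {
    uint32_t ip_src;
    uint32_t ip_dst;
    uint8_t ip_proto;
    bool is_fragment;       /* Set on every fragment, first or later. */
    bool has_l4;            /* Only the first fragment carries L4 ports. */
    uint16_t l4_src;
    uint16_t l4_dst;
};

/* Toy FNV-1a style mixing step; stands in for the real hash function. */
static uint32_t
toy_mix(uint32_t hash, uint32_t data)
{
    for (int i = 0; i < 4; i++) {
        hash ^= (data >> (8 * i)) & 0xff;
        hash *= 16777619u;
    }
    return hash;
}

/* Hashes the L3 fields, then the L4 ports if they are present.  When
 * 'skip_l4_for_fragments' is true, fragments are hashed on L3 fields only. */
static uint32_t
toy_hash(const struct toy_pkt *p, bool skip_l4_for_fragments)
{
    uint32_t hash = 2166136261u;

    hash = toy_mix(hash, p->ip_src);
    hash = toy_mix(hash, p->ip_dst);
    hash = toy_mix(hash, p->ip_proto);

    if (skip_l4_for_fragments && p->is_fragment) {
        return hash;
    }
    if (p->has_l4) {
        hash = toy_mix(hash, ((uint32_t) p->l4_src << 16) | p->l4_dst);
    }
    return hash;
}

/* Maps a hash onto a UDP source port.  The range is an assumption made
 * for this sketch, not a value taken from OVS. */
static uint16_t
toy_udp_src_port(uint32_t hash)
{
    const uint32_t lo = 32768, hi = 61000;

    assert(lo < hi);
    return (uint16_t) (lo + hash % (hi - lo + 1));
}
```

With `skip_l4_for_fragments` set to false, the first fragment mixes in its L4 ports while later fragments cannot, so the two hashes (and hence the derived source ports) will almost always differ. With the flag set to true, every fragment of the same IP packet hashes the same L3 fields and is guaranteed the same source port. Unfragmented packets are unaffected by the flag.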
Fix: With this patch, we ignore the L4 header during hash calculation in the case of fragmented packets.

Signed-off-by: Parvathy Tarur Ramachandran <[email protected]>
---
 lib/flow.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/flow.c b/lib/flow.c
index cc1b3f2..38bf377 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -2178,7 +2178,7 @@ miniflow_hash_5tuple(const struct miniflow *flow, uint32_t basis)

     if (flow) {
         ovs_be16 dl_type = MINIFLOW_GET_BE16(flow, dl_type);
-        uint8_t nw_proto;
+        uint8_t nw_proto, nw_frag;

         if (dl_type == htons(ETH_TYPE_IPV6)) {
             struct flowmap map = FLOWMAP_EMPTY_INITIALIZER;
@@ -2200,6 +2200,14 @@ miniflow_hash_5tuple(const struct miniflow *flow, uint32_t basis)

         nw_proto = MINIFLOW_GET_U8(flow, nw_proto);
         hash = hash_add(hash, nw_proto);
+        /* Skip l4 header fields if IP packet is fragmented since
+         * only first fragment will carry l4 header.
+         */
+        nw_frag = MINIFLOW_GET_U8(flow, nw_frag);
+        if (nw_frag) {
+            goto out;
+        }
+
         if (nw_proto != IPPROTO_TCP && nw_proto != IPPROTO_UDP
             && nw_proto != IPPROTO_SCTP && nw_proto != IPPROTO_ICMP
             && nw_proto != IPPROTO_ICMPV6) {
--
2.7.4
