Jesse,

I modified the source port hashing for the VXLAN patch I submitted a few days ago, but I've noticed that with the upstream source port hashing routine, throughput drops by about 3.5 times when running iperf between two VMs. As far as I can tell, the skbuffs coming into the VXLAN tunnel do not already have their rxhash set, so this function ends up computing it, and that is what's killing performance. Let me share the details:
If I use this source port hashing function and call it each time build_header() is called, performance is as shown below:

	static u16 get_src_port(const struct sk_buff *skb,
				const struct tnl_mutable_config *mutable)
	{
		unsigned int range = (VXLAN_SRC_PORT_MAX - VXLAN_SRC_PORT_MIN) + 1;
		u32 hash;

		hash = skb_get_rxhash(skb);
		if (!hash)
			hash = jhash(skb->data, 2 * ETH_ALEN,
				     (__force u32) skb->protocol);

		return (((u64) hash * range) >> 32) + VXLAN_SRC_PORT_MIN;
	}

	[root@linux-br ~]# iperf -c 10.1.2.14
	------------------------------------------------------------
	Client connecting to 10.1.2.14, TCP port 5001
	TCP window size: 22.9 KByte (default)
	------------------------------------------------------------
	[  3] local 10.1.2.13 port 60184 connected with 10.1.2.14 port 5001
	[ ID] Interval       Transfer     Bandwidth
	[  3]  0.0-10.0 sec   255 MBytes   214 Mbits/sec
	[root@linux-br ~]#

Now, if I replace it with the following source port hashing function (notice that I'm checking specifically whether rxhash is set instead of calling skb_get_rxhash(), which computes it if it's not set), performance increases:

	static u16 get_src_port(const struct sk_buff *skb,
				const struct tnl_mutable_config *mutable)
	{
		unsigned int range = (VXLAN_SRC_PORT_MAX - VXLAN_SRC_PORT_MIN) + 1;
		u32 hash;

		if (!skb->rxhash)
			return (__force __be16)OVS_CB(skb)->flow->hash | htons(VXLAN_SRC_PORT_MIN);
		else {
			hash = skb_get_rxhash(skb);
			return (((u64) hash * range) >> 32) + VXLAN_SRC_PORT_MIN;
		}
	}

	[root@linux-br ~]# iperf -c 10.1.2.14
	------------------------------------------------------------
	Client connecting to 10.1.2.14, TCP port 5001
	TCP window size: 22.9 KByte (default)
	------------------------------------------------------------
	[  3] local 10.1.2.13 port 40839 connected with 10.1.2.14 port 5001
	[ ID] Interval       Transfer     Bandwidth
	[  3]  0.0-10.0 sec   815 MBytes   683 Mbits/sec
	[root@linux-br ~]#

This is consistently reproducible.
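For anyone who wants to poke at the range-scaling step outside the kernel, here is a minimal userspace sketch of it. It is not the patch code: the port bounds are assumed values (this mail does not give VXLAN_SRC_PORT_MIN or VXLAN_SRC_PORT_MAX), and hash_to_port() and pick_hash() are hypothetical helper names. pick_hash() only illustrates the idea of reusing an already-available hash rather than computing one per packet.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bounds, for illustration only; the real constants are not in this mail. */
#define SRC_PORT_MIN 32768u
#define SRC_PORT_MAX 61000u

/*
 * Scale a 32-bit hash into [min, max] with a multiply and shift,
 * the same arithmetic as ((u64) hash * range) >> 32 in get_src_port().
 */
static uint16_t hash_to_port(uint32_t hash, uint16_t min, uint16_t max)
{
	assert(min <= max);

	uint32_t range = (uint32_t)(max - min) + 1;

	return (uint16_t)((((uint64_t)hash * range) >> 32) + min);
}

/*
 * Hypothetical helper: prefer a hash that is already cached on the
 * packet (non-zero), otherwise fall back to one computed earlier,
 * such as a per-flow hash, so nothing is hashed per packet here.
 */
static uint32_t pick_hash(uint32_t cached_rxhash, uint32_t flow_hash)
{
	return cached_rxhash ? cached_rxhash : flow_hash;
}
```

Calling hash_to_port(pick_hash(rxhash, flow_hash), SRC_PORT_MIN, SRC_PORT_MAX) always yields a port inside the assumed range, whichever hash ends up being used.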
This leads me to believe that skbuff->rxhash is not already set on each packet, and that computing it per packet is very costly. I'm further stymied as to how this works in the upstream VXLAN driver and still achieves acceptable performance. Is skbuff->rxhash computed somewhere else upstream? It wasn't clear to me where that would be from looking at the upstream vxlan driver.

Thanks,
Kyle