From: Tonghao Zhang <xiangxia.m....@gmail.com> Date: Mon, 17 Jul 2017 23:28:06 -0700
> When calling the flow_free() to free the flow, we call many times > (cpu_possible_mask, eg. 128 as default) cpumask_next(). That will > take up our CPU usage if we call the flow_free() frequently. > When we put all packets to userspace via upcall, and OvS will send > them back via netlink to ovs_packet_cmd_execute(will call flow_free). > > The test topo is shown as below. VM01 sends TCP packets to VM02, > and OvS forward packtets. When testing, we use perf to report the > system performance. > > VM01 --- OvS-VM --- VM02 > > Without this patch, perf-top show as below: The flow_free() is > 3.02% CPU usage. > > 4.23% [kernel] [k] _raw_spin_unlock_irqrestore > 3.62% [kernel] [k] __do_softirq > 3.16% [kernel] [k] __memcpy > 3.02% [kernel] [k] flow_free > 2.42% libc-2.17.so [.] __memcpy_ssse3_back > 2.18% [kernel] [k] copy_user_generic_unrolled > 2.17% [kernel] [k] find_next_bit > > When applied this patch, perf-top show as below: Not shown on > the list anymore. > > 4.11% [kernel] [k] _raw_spin_unlock_irqrestore > 3.79% [kernel] [k] __do_softirq > 3.46% [kernel] [k] __memcpy > 2.73% libc-2.17.so [.] __memcpy_ssse3_back > 2.25% [kernel] [k] copy_user_generic_unrolled > 1.89% libc-2.17.so [.] _int_malloc > 1.53% ovs-vswitchd [.] xlate_actions > > With this patch, the TCP throughput(we dont use Megaflow Cache > + Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec > (maybe ~10% performance imporve). > > This patch adds cpumask struct, the cpu_used_mask stores the cpu_id > that the flow used. And we only check the flow_stats on the cpu we > used, and it is unncessary to check all possible cpu when getting, > cleaning, and updating the flow_stats. Adding the cpu_used_mask to > sw_flow struct does’t increase the cacheline number. > > Signed-off-by: Tonghao Zhang <xiangxia.m....@gmail.com> > Acked-by: Pravin B Shelar <pshe...@ovn.org> Applied.