On 12/10/19 12:37 PM, Matteo Croce wrote:
> On Tue, Dec 10, 2019 at 8:13 PM David Ahern <[email protected]> wrote:
>>
>> Hi Matteo:
>>
>> On a hypervisor running a 4.14.91 kernel and OVS 2.11 I am seeing a
>> thundering herd wake up problem. Every packet punted to userspace wakes
>> up every one of the handler threads. On a box with 96 cpus, there are 71
>> handler threads which means 71 process wakeups for every packet punted.
>>
>> This is really easy to see, just watch sched:sched_wakeup tracepoints.
>> With a few extra probes:
>>
>> perf probe sock_def_readable sk=%di
>> perf probe ep_poll_callback wait=%di mode=%si sync=%dx key=%cx
>> perf probe __wake_up_common wq_head=%di mode=%si nr_exclusive=%dx
>> wake_flags=%cx key=%8
>>
>> you can see there is a single netlink socket and its wait queue contains
>> an entry for every handler thread.
>>
>> This does not happen with the 2.7.3 version. Roaming commits it appears
>> that the change in behavior comes from this commit:
>>
>> commit 69c51582ff786a68fc325c1c50624715482bc460
>> Author: Matteo Croce <[email protected]>
>> Date:   Tue Sep 25 10:51:05 2018 +0200
>>
>>     dpif-netlink: don't allocate per thread netlink sockets
>>
>>
>> Is this a known problem?
>>
>> David
>>
>
> Hi David,
>
> before my patch, vswitchd created NxM sockets, being N the ports and M
> the active cores,
> because every thread opens a netlink socket per port.
>
> With my patch, a pool is created with N socket, one per port, and all
> the threads polls the same list
> with the EPOLLEXCLUSIVE flag.
> As the name suggests, EPOLLEXCLUSIVE lets the kernel wakeup only one
> of the waiting threads.
>
> I'm not aware of this problem, but it goes against the intended
> behaviour of EPOLLEXCLUSIVE.
> Such flag exists since Linux 4.5, can you check that it's passed
> correctly to epoll()?
>

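
For reference, here is a minimal sketch of what "passed correctly" would mean at the epoll_ctl() level. It is not the OVS code: add_exclusive() and check_exclusive_registration() are hypothetical names, and an eventfd stands in for the netlink socket. It relies on two documented properties of the flag: EPOLLEXCLUSIVE is only accepted with EPOLL_CTL_ADD, and a later EPOLL_CTL_MOD on the same (epfd, fd) pair fails with EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper (not taken from OVS): register fd for read events
 * with EPOLLEXCLUSIVE. The flag is only valid with EPOLL_CTL_ADD. */
static int add_exclusive(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };

    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Returns 0 if the kernel accepted the flag and behaves as documented,
 * or a negative number identifying the step that failed. */
static int check_exclusive_registration(void)
{
    int rc = -1;                          /* -1: setup failed */
    int efd = eventfd(0, EFD_NONBLOCK);   /* stand-in for the netlink socket */
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event got;
    uint64_t one = 1;

    if (efd < 0 || epfd < 0)
        goto done;
    ev.data.fd = efd;

    rc = -2;                              /* -2: ADD with the flag rejected */
    if (add_exclusive(epfd, efd) != 0)
        goto done;

    rc = -3;                              /* -3: MOD unexpectedly allowed */
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, efd, &ev) != -1 || errno != EINVAL)
        goto done;

    rc = -4;                              /* -4: event not delivered */
    if (write(efd, &one, sizeof one) != (ssize_t) sizeof one)
        goto done;
    if (epoll_wait(epfd, &got, 1, 0) != 1 || got.data.fd != efd)
        goto done;

    rc = 0;
done:
    if (efd >= 0)
        close(efd);
    if (epfd >= 0)
        close(epfd);
    return rc;
}
```

Calling check_exclusive_registration() on the affected kernel only confirms that the flag is accepted at registration time; it says nothing about how many waiters are woken per event, which is what the nr_exclusive probe above looks at.
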
I get the theory, but the reality is that all threads are awakened.
Also, it is not limited to the 4.14 kernel; I see the same behavior with
5.4.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
