> > > > Upfront note - that RFC is not a complete patch.
> > > > It introduces an ABI breakage, plus it doesn't update the
> > > > ring_elem code properly, etc.
> > > > I plan to deal with all these things in later versions.
> > > > Right now I seek initial feedback on the proposed ideas.
> > > > I would also ask people to repeat the performance tests (see
> > > > below) on their platforms to confirm the impact.
> > > >
> > > > More and more customers use (or try to use) DPDK-based apps
> > > > within overcommitted systems (multiple active threads over the
> > > > same physical cores): VM, container deployments, etc.
> > > > One quite common problem they hit: Lock-Holder-Preemption (LHP)
> > > > with rte_ring.
> > > > LHP is quite a common problem for spin-based sync primitives
> > > > (spin-locks, etc.) on overcommitted systems.
> > > > The situation gets much worse when some sort of fair-locking
> > > > technique is used (ticket-lock, etc.), as then not only the
> > > > lock-owner's but also the lock-waiters' scheduling order
> > > > matters a lot.
> > > > This is a well-known problem for kernels within VMs:
> > > > http://www-archive.xenproject.org/files/xensummitboston08/LHP.pdf
> > > > https://www.cs.hs-rm.de/~kaiser/events/wamos2017/Slides/selcuk.pdf
>
> These slides seem to indicate that the problems are mitigated through
> the hypervisor configuration. Do we still need to address the issues?
I am not really an expert here, but AFAIK current mitigations deal
mostly with the guest kernel: Linux implements PV versions of spinlocks
(unfair and/or based on hypercall availability), and the hypervisor
might make the decision itself based on whether the guest is in
user/kernel mode, plus on some special CPU instructions.
We do spin in user-space mode.
Hypervisors might have become smarter these days, but so far I have
heard about a few different customers that hit such a problem.
As an example, this NA DPDK summit presentation:
https://dpdkna2019.sched.com/event/WYBG/dpdk-containers-challenges-solutions-wang-yong-zte
page 16 (problem #4) describes the same issue.

> > > > > The problem with rte_ring is that while head acquisition is a
> > > > > sort of unfair locking, waiting on the tail is very similar to
> > > > > the ticket-lock schema - the tail has to be updated in a
> > > > > particular order.
> > > > > That makes the current rte_ring implementation perform really
> > > > > poorly in some overcommitted scenarios.
> > > >
> > > Rather than reform rte_ring to fit this scenario, it would make
> > > more sense to me to introduce another primitive. The current
> > > lockless ring performs very well for the isolated thread model
> > > that DPDK was built around. This looks like a case of customers
> > > violating the usage model of the DPDK and then being surprised at
> > > the fallout.
> >
> > I agree with Stephen here.
> >
> > I think adding more runtime checks in enqueue() and dequeue() will
> > have a bad effect on low-end cores too.
> > But I agree with the problem statement that in the virtualization
> > use case, it may be possible to have N virtual cores running on one
> > physical core.
>
> It is hard to imagine that there are data plane applications deployed
> in such environments. Wouldn't this affect the performance terribly?

It wouldn't reach the same performance as isolated threads, but for
some tasks it might be enough. AFAIK, one quite common scenario: a few
isolated threads (or processes) do the actual IO and then spread
packets over dozens (or hundreds) of non-isolated consumers.

> > IMO, the best solution would be keeping the ring API the same and
> > having a different flavor selected at compile time, something like
> > liburcu did to accommodate different flavors.
> >
> > i.e. urcu-qsbr.h and urcu-bp.h have identical definitions of the
> > API. The application can simply include ONE header file in a C file
> > based on the flavor.
> > If both are needed at runtime, the application needs a function
> > pointer or so, and defines the function in different C files, each
> > including the appropriate flavor.
> >
> > #include <urcu-qsbr.h> /* QSBR RCU flavor */
> > #include <urcu-bp.h>   /* Bulletproof RCU flavor */
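To make that suggestion concrete, below is a hypothetical sketch of how
the same flavor scheme could look for rte_ring. To be clear: the
RING_FLAVOR_PT macro and both header names are made up for illustration
and do not exist in DPDK today; only struct rte_ring and
rte_ring_enqueue() are the real API.

/*
 * Hypothetical illustration of the liburcu-style flavor approach.
 * Neither header below exists in DPDK; each would have to define the
 * identical rte_ring API, so a C file picks exactly one of them.
 */
#ifdef RING_FLAVOR_PT
#include <rte_ring_pt.h>	/* made-up LHP-tolerant flavor */
#else
#include <rte_ring_def.h>	/* made-up name for current behavior */
#endif

/* Call sites stay untouched regardless of the flavor chosen above. */
static int
forward_pkt(struct rte_ring *r, void *pkt)
{
	return rte_ring_enqueue(r, pkt);
}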
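And for anyone skimming the thread, here is a simplified sketch of the
tail update described above as "very similar to the ticket-lock
schema". It paraphrases the MP/MC tail-update step that follows every
rte_ring enqueue/dequeue; the struct and names are trimmed down for
illustration and are not the verbatim DPDK source.

#include <stdint.h>
#include <rte_pause.h>	/* rte_pause() */

/* Trimmed-down stand-in for DPDK's struct rte_ring_headtail. */
struct headtail {
	volatile uint32_t head;	/* advanced first, when slots are claimed */
	volatile uint32_t tail;	/* advanced second, in head-claim order */
};

static inline void
update_tail(struct headtail *ht, uint32_t old_val, uint32_t new_val)
{
	/*
	 * Tails must advance in the same order the heads were claimed.
	 * If the thread that claimed the preceding slots is preempted
	 * between moving the head and moving the tail, every later
	 * thread can only spin here - the same ordering property that
	 * makes ticket locks suffer from Lock-Holder-Preemption.
	 */
	while (ht->tail != old_val)
		rte_pause();

	ht->tail = new_val;
}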