> Date: Tue, 8 Jul 2014 09:22:41 +0200 (CEST)
> From: Stefan Fritsch <[email protected]>
>
> Hi,
>
> I have been trying to increase fork performance of openbsd/amd64 on KVM.
> It turns out that when I increase the number of CPUs of a VM from 1 to 3,
> a fork+exit micro benchmark is slowed down by a factor of 7.
>
> The main reason for this seems to be a very large number of cross-CPU TLB
> flushes (about 4 per fork+exit). Each IPI causes several VM exits which
> are expensive. To reduce this, I have been trying to use paravirtualized
> interfaces provided by KVM and optimize some other things. These changes
> are mostly activated by a new pseudo device paravirt (which has the
> advantage that one can use UKC to tweak things without recompiling).
> However, some changes will remain if not running on a hypervisor (or
> paravirt is disabled). For example, x86_ipi() will use a pointer to
> dispatch to the appropriate implementation.
>
> Is this the way to go forward? Or would you rather prefer a compile time
> option and maybe ship a bsd.mp.paravirt kernel in addition to bsd+bsd.mp?

Are these paravirtualization APIs stable?  Are they (properly)
documented somewhere?

If we're serious about supporting OpenBSD on (KVM) hypervisors,
something like this makes sense.  We tend to try to have a single
kernel that runs on the widest possible range of hardware.  For
example, the OpenBSD/sparc64 kernel runs on both sun4u and sun4v
hardware, and the sun4v platform has paravirtualization written all
over it.  There I successfully made use of code patching techniques.
That might help on x86 as well.

Can't say I'm happy with making the interrupt handling code even more
complicated though...
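
To make the trade-off concrete, the pointer dispatch described for
x86_ipi() amounts to roughly the following.  This is only a sketch
under assumptions: every name in it (ipi_select, send_ipi, native_ipi,
paravirt_ipi) is hypothetical, the two backends are stand-ins that
merely count calls, and none of it is taken from the actual diff or
from the KVM interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the real OpenBSD or KVM interfaces. */
typedef int (*ipi_send_fn)(int vector, int target_cpu);

static int native_ipi_calls;	/* counters exist only for this illustration */
static int paravirt_ipi_calls;

/* Stand-in for the bare-metal path (would program the local APIC). */
static int
native_ipi(int vector, int target_cpu)
{
	(void)vector;
	(void)target_cpu;
	native_ipi_calls++;
	return 0;
}

/* Stand-in for a hypervisor-assisted path. */
static int
paravirt_ipi(int vector, int target_cpu)
{
	(void)vector;
	(void)target_cpu;
	paravirt_ipi_calls++;
	return 0;
}

/* Default to the native backend so bare metal needs no extra setup. */
static ipi_send_fn ipi_send = native_ipi;

/* Would be called once at boot, after hypervisor detection. */
void
ipi_select(bool on_hypervisor, bool paravirt_enabled)
{
	if (on_hypervisor && paravirt_enabled)
		ipi_send = paravirt_ipi;
	else
		ipi_send = native_ipi;
}

/* Every caller pays one indirect call, whichever backend is active. */
int
send_ipi(int vector, int target_cpu)
{
	assert(ipi_send != NULL);
	return ipi_send(vector, target_cpu);
}
```

The indirect call stays in the path even when not running on a
hypervisor.  Code patching would instead rewrite the call site once at
boot, at the cost of more machinery at startup.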
