On 10.09.2015 at 03:55, Wanpeng Li wrote:
> On 9/9/15 9:39 PM, Christian Borntraeger wrote:
>> On 03.09.2015 at 16:07, Wanpeng Li wrote:
>>> v6 -> v7:
>>>  * explicit signal (set a bool)
>>>  * fix the tracepoint
>>>
>>> v5 -> v6:
>>>  * fix wait_ns and poll_ns
>>>
>>> v4 -> v5:
>>>  * set base case 10us and max poll time 500us
>>>  * handle short/long halt, idea from David, many thanks David
>>>
>>> v3 -> v4:
>>>  * bring back growing vcpu->halt_poll_ns when an interrupt arrives and
>>>    shrinking it when an idle VCPU is detected
>>>
>>> v2 -> v3:
>>>  * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>    /halt_poll_ns_shrink
>>>  * drop the macros and hard-code the numbers in the param definitions
>>>  * update the comments "5-7 us"
>>>  * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns
>>>    time, vcpu->halt_poll_ns starts at zero
>>>  * drop the wrappers
>>>  * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>
>>> v1 -> v2:
>>>  * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
>>>    the module parameter
>>>  * use the shrink/grow matrix which is suggested by David
>>>  * set halt_poll_ns_max to 2ms
>>>
>>> There is a downside to always-poll since polling still happens for idle
>>> vCPUs, which can waste cpu usage. This patchset adds the ability to adjust
>>> halt_poll_ns dynamically, growing halt_poll_ns when a short halt is
>>> detected and shrinking it when a long halt is detected.
>>>
>>> There are two new kernel parameters for changing the halt_poll_ns:
>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>
>>>                            no-poll      always-poll    dynamic-poll
>>> -----------------------------------------------------------------------
>>> Idle (nohz) vCPU %c0       0.15%        0.3%           0.2%
>>> Idle (250HZ) vCPU %c0      1.1%         4.6%~14%       1.2%
>>> TCP_RR latency             34us         27us           26.7us
>>>
>>> "Idle (X) vCPU %c0" is the percent of time the physical cpu spent in
>>> c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the
>>> guest was tickless. (250HZ) means the guest was ticking at 250HZ.
>>>
>>> The big win is with ticking operating systems. Running the linux guest
>>> with nohz=off (and HZ=250), we save 3.4%~12.8% CPUs/second and get close
>>> to no-polling overhead levels by using dynamic-poll. The savings
>>> should be even higher for higher frequency ticks.
>>>
>>> Wanpeng Li (3):
>>>   KVM: make halt_poll_ns per-vCPU
>>>   KVM: dynamic halt-polling
>>>   KVM: trace kvm_halt_poll_ns grow/shrink
>>>
>>>  include/linux/kvm_host.h   |  1 +
>>>  include/trace/events/kvm.h | 30 +++++++++++++++++++
>>>  virt/kvm/kvm_main.c        | 72 ++++++++++++++++++++++++++++++++++++++++++----
>>>  3 files changed, 97 insertions(+), 6 deletions(-)
>>>
>> I get some nice improvements for uperf between 2 guests,
>
> Good to hear that.
>
>> but there is one "bug":
>> If there is already some polling ongoing, it's impossible to disable the
>> polling,
>
> The polling will stop if a long halt is detected, and there is no need for
> manual tuning, just like the dynamic PLE window can detect false positives
> and adjust the PLE window suitably.
Yes, but as soon as somebody sets halt_poll_ns to 0, polling will never
stop, as grow and shrink are only handled if halt_poll_ns is != 0.

[...]
        if (halt_poll_ns) {
                if (block_ns <= vcpu->halt_poll_ns)
                        ;
                /* we had a long block, shrink polling */
                else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
                        shrink_halt_poll_ns(vcpu);
                /* we had a short halt and our poll time is too small */
                else if (vcpu->halt_poll_ns < halt_poll_ns &&
                         block_ns < halt_poll_ns)
                        grow_halt_poll_ns(vcpu);
        }
[...]

so maybe just do something like

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4662a88..48828d6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2012,6 +2012,8 @@ out:
 		else if (vcpu->halt_poll_ns < halt_poll_ns &&
 			block_ns < halt_poll_ns)
 			grow_halt_poll_ns(vcpu);
+	} else {
+		vcpu->halt_poll_ns = 0;
 	}
 
 	trace_kvm_vcpu_wakeup(block_ns, waited);
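
For what it's worth, here is a stand-alone user-space sketch of how I read
the grow/shrink behaviour from the cover letter (10us base, halt_poll_ns as
the cap, *halt_poll_ns_grow on a short halt, /halt_poll_ns_shrink on a long
halt) together with the extra else branch from the diff above. The helper
names mirror the description, but the parameter defaults and the whole
setup are my assumptions, not the actual patch code:

/*
 * Sketch only: grow the per-vCPU window from a 10us base by
 * *halt_poll_ns_grow on a short halt, shrink it by /halt_poll_ns_shrink
 * (or straight to 0) on a long halt, and -- with the proposed else
 * branch -- drop it to 0 once the module parameter is 0.
 */
#include <stdio.h>

static unsigned int halt_poll_ns = 500000;     /* module param, also the cap */
static unsigned int halt_poll_ns_grow = 2;     /* assumed default */
static unsigned int halt_poll_ns_shrink;       /* 0: shrink straight to 0 */

struct vcpu {
        unsigned int halt_poll_ns;             /* per-vCPU poll window */
};

static void grow_halt_poll_ns(struct vcpu *vcpu)
{
        unsigned int val = vcpu->halt_poll_ns;

        if (val == 0 && halt_poll_ns_grow)
                val = 10000;                   /* 10us base case */
        else
                val *= halt_poll_ns_grow;

        vcpu->halt_poll_ns = val;
}

static void shrink_halt_poll_ns(struct vcpu *vcpu)
{
        unsigned int val = vcpu->halt_poll_ns;

        if (halt_poll_ns_shrink == 0)
                val = 0;
        else
                val /= halt_poll_ns_shrink;

        vcpu->halt_poll_ns = val;
}

/* the tail of kvm_vcpu_block(), including the proposed "} else {" branch */
static void adjust_after_block(struct vcpu *vcpu, unsigned int block_ns)
{
        if (halt_poll_ns) {
                if (block_ns <= vcpu->halt_poll_ns)
                        ;                      /* window was big enough */
                /* we had a long block, shrink polling */
                else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
                        shrink_halt_poll_ns(vcpu);
                /* we had a short halt and our poll time is too small */
                else if (vcpu->halt_poll_ns < halt_poll_ns &&
                         block_ns < halt_poll_ns)
                        grow_halt_poll_ns(vcpu);
        } else {
                vcpu->halt_poll_ns = 0;        /* proposed fix */
        }
}

int main(void)
{
        struct vcpu v = { .halt_poll_ns = 0 };

        adjust_after_block(&v, 5000);          /* short halt: 0 -> 10000 ns */
        adjust_after_block(&v, 15000);         /* short halt: 10000 -> 20000 ns */
        printf("grown window: %u ns\n", v.halt_poll_ns);

        halt_poll_ns = 0;                      /* admin disables polling */
        adjust_after_block(&v, 5000);          /* with the fix: back to 0 */
        printf("window after halt_poll_ns=0: %u ns\n", v.halt_poll_ns);

        return 0;
}

Without the else branch the last adjust_after_block() call would leave the
window at 20000 ns, i.e. the vCPU keeps polling even though the admin asked
for no polling at all.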