Public bug reported:

[Environment]

Reproducible on kernels <= 4.12, which covers all of our supported series
plus HWE (including edge).

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial

Linux 4.10.0-33-generic #37~16.04.1-Ubuntu SMP Fri Aug 11 14:07:24 UTC
2017 x86_64 x86_64 x86_64 GNU/Linux

[Description]

We've identified a constantly high (~90%) system time load at the host level
when a VCPU in a KVM guest repeatedly enters and leaves the halt/idle state
at a constant frequency, usually halting each time for slightly less than
the default polling period.

The halt polling mechanism is intended to reduce latency in cases where the
guest is resumed quickly, saving a call to the scheduler.

We've performed some testing by adjusting the
/sys/module/kvm/parameters/halt_poll_ns value, which defines the maximum
time that should be spent polling before calling the scheduler to allow it
to run other tasks (it defaults to 400000 ns in Ubuntu).
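
For inspecting or changing this parameter from a script, here is a minimal
Python sketch. The helper names read_halt_poll_ns and write_halt_poll_ns are
hypothetical; only the sysfs path comes from this report. Writing needs root,
and a value set this way does not survive a reboot or a reload of the kvm
module.

```python
from pathlib import Path

# sysfs path taken from the report; the helpers below are illustrative only.
HALT_POLL_NS = Path("/sys/module/kvm/parameters/halt_poll_ns")


def read_halt_poll_ns(path: Path = HALT_POLL_NS) -> int:
    """Return the current maximum halt polling time in nanoseconds."""
    return int(path.read_text().strip())


def write_halt_poll_ns(value_ns: int, path: Path = HALT_POLL_NS) -> None:
    """Set the maximum halt polling time (needs root on a real host)."""
    if value_ns < 0:
        raise ValueError("halt_poll_ns must be non-negative")
    path.write_text(f"{value_ns}\n")
```

On a KVM host, print(read_halt_poll_ns()) shows the value currently in effect.
The path argument exists so the helpers can be pointed at an ordinary file
when trying them out without root.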

With the default value, the tests show that the load remains at nearly 90%
on a VCPU that has a single task in the run queue.

We've also tested lowering the halt_poll_ns value to 200000 ns, and the
results show system time usage dropping from ~90% to ~25%.

root@buneary:/home/ubuntu/trace# echo 200000 > /sys/module/kvm/parameters/halt_poll_ns
root@buneary:/home/ubuntu/trace# sudo mpstat 1 -P 6 5
Linux 4.10.0-33-generic (buneary) 10/16/2017 _x86_64_   (56 CPU)

02:33:59 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
[...]
02:34:03 PM 6 4.04 0.00 24.24 0.00 0.00 0.00 0.00 12.12 0.00 59.60
02:34:04 PM 6 2.97 0.00 27.72 0.00 0.00 0.00 0.00 11.88 0.00 57.43
Average: 6 2.45 0.00 25.97 0.00 0.00 0.00 0.00 12.07 0.00 59.51

root@buneary:/home/ubuntu/trace# echo 400000 > /sys/module/kvm/parameters/halt_poll_ns
root@buneary:/home/ubuntu/trace# sudo mpstat 1 -P 6 5
Linux 4.10.0-33-generic (buneary) 10/16/2017 _x86_64_   (56 CPU)

02:34:08 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:34:09 PM 6 1.94 0.00 92.23 0.00 0.00 0.00 0.00 3.88 0.00 1.94
[...]
Average: 6 1.38 0.00 89.74 0.00 0.00 0.00 0.00 7.30 0.00 1.58
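
To compare runs like the two above without reading the columns by eye, the
per-interval %sys values can be averaged with a few lines of Python. This is
a sketch that assumes the 12-hour timestamp layout shown here (time, AM/PM,
CPU, then the counters); mean_sys is a hypothetical helper, not part of
sysstat.

```python
def mean_sys(mpstat_output: str, cpu: str = "6") -> float:
    """Average the %sys column over the per-interval samples for one CPU."""
    sys_col = None
    cpu_col = None
    samples = []
    for line in mpstat_output.splitlines():
        fields = line.split()
        if "%sys" in fields and "CPU" in fields:
            # Header row: remember where the CPU and %sys columns sit.
            sys_col = fields.index("%sys")
            cpu_col = fields.index("CPU")
            continue
        if sys_col is None or len(fields) <= sys_col:
            # Banner, blank, or elided ("[...]") line.
            continue
        if fields[0] == "Average:":
            # The summary row has a different layout; skip it.
            continue
        if fields[cpu_col] == cpu:
            samples.append(float(fields[sys_col]))
    if not samples:
        raise ValueError("no samples found for CPU " + cpu)
    return sum(samples) / len(samples)
```

Only the sample rows present in the pasted text are counted, so with the
elided output above the result will differ slightly from the Average row
that mpstat printed over all five intervals.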

[Reproducer]

1) Configure a KVM guest with a single pinned VCPU.
2) Run the following program (http://pastebin.ubuntu.com/25731919/) in the
KVM guest.
$ gcc test.c -lpthread -o test && ./test 250 0
3) Run mpstat on the host against the pinned CPU and compare the stats.
$ sudo mpstat 1 -P 6 5
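
The linked program is not reproduced here. As a rough illustration of the
kind of load described in this report (a task that sleeps for a short fixed
interval, so the VCPU goes idle and is woken again at a constant frequency),
here is a hypothetical Python stand-in. The function name and the
250-microsecond interval are assumptions; the meaning of the arguments
"250 0" passed to ./test is not stated in this report.

```python
import time


def periodic_idle(interval_us: float, iterations: int) -> float:
    """Sleep interval_us microseconds, iterations times; return elapsed seconds."""
    interval_s = interval_us / 1_000_000
    start = time.perf_counter()
    for _ in range(iterations):
        # Each sleep lets the guest VCPU go idle until the timer wakes it.
        time.sleep(interval_s)
    return time.perf_counter() - start
```

Calling periodic_idle(250, 1_000_000) inside the guest keeps the single VCPU
alternating between short sleeps and wakeups for several minutes.
Interpreter and timer overhead make each real period longer than the
requested interval, so this is not a substitute for the original C test.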

[Fix]

Halve the default maximum halt polling time, from 400000 ns to 200000 ns.

In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
increase heavily even in cases where the performance improvement was
small.  In particular, bandwidth divided by CPU usage was as much as
60% lower.

To some extent this is the expected effect of the patch, and the
additional CPU utilization is only visible when running the
benchmarks.  However, halving the threshold also halves the extra
CPU utilization (from +30-130% to +20-70%) and has no negative
effect on performance.

Signed-off-by: Paolo Bonzini <[email protected]>

* https://github.com/torvalds/linux/commit/b401ee0b85a53e89739ff68a5b1a0667d664afc9

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: sts

https://bugs.launchpad.net/bugs/1724614

Title:
  [KVM] Lower the default for halt_poll_ns to 200000 ns
