Hi,
Sorry it took so slow to reply you - there are only few moments when I
can poke a production server and I need to notify people in advance
about that :(
> Can you post kvm_stat output while slowness is happening? 'perf top' on the
> host? and on the guest?
I took 'perf top' and first thing I saw is that while guest is on
acpi_pm, it shows more or less normal amount of IRQs (under 1000/s),
however when I switched back to the default (which is nohz with
kvm_clock), there are 40 times (!!!) more IRQs under normal operation
(about 40 000/s). When the slowdown is happening, there are a lot of
_spin_lock events and a lot of messages like: "WARNING: failed to keep
up with mmap data. Last read 810 msecs ago."
As I told before, switching to acpi_pm does not save the day, but
makes situation a lot more workable (i.e., servers recover faster from
the period of slowness). During slowdowns on acpi_pm I also see
"_spin_lock"
Raw data follows:
vmstat -5 on the host:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 131904 13952 205872 0 0 0 24 2495 9813 6 3 91 0
0 0 0 132984 13952 205872 0 0 0 47 2596 9851 5 3 91 1
1 0 0 132148 13952 205872 0 0 0 54 2644 10559 3 3 93 1
0 1 0 129084 13952 205872 0 0 0 38 3039 9752 7 3 87 2
6 0 0 126388 13952 205872 0 0 0 311 15619 9009 42 17 39 2
9 0 0 125868 13960 205872 0 0 6 86 4659 6504 98 2 0 0
8 0 0 123320 13960 205872 0 0 0 26 4682 6649 98 2 0 0
8 0 0 126252 13960 205872 0 0 0 124 4923 6776 98 2 0 0
8 0 0 125376 13960 205872 0 0 136 11 4287 5865 98 2 0 0
9 0 0 123812 13960 205872 0 0 205 51 4497 6134 98 2 0 0
8 0 0 126020 13960 205872 0 0 904 26 4483 5999 98 2 0 0
8 0 0 124052 13960 205872 0 0 15 10 4397 6200 98 2 0 0
8 0 0 125928 13960 205872 0 0 14 41 4335 5823 98 2 0 0
8 0 0 126184 13960 205872 0 0 6 14 4966 6588 98 2 0 0
8 0 0 123588 13960 205872 0 0 143 18 5234 6891 98 2 0 0
8 0 0 126640 13960 205872 0 0 6 91 5554 7334 98 2 0 0
8 0 0 123144 13960 205872 0 0 146 11 5235 7145 98 2 0 0
8 0 0 125856 13968 205872 0 0 1282 98 5481 7159 98 2 0 0
9 19 0 124124 13968 205872 0 0 782 2433 8587 8987 97 3 0 0
8 0 0 122584 13968 205872 0 0 432 90 5359 6960 98 2 0 0
8 0 0 125320 13968 205872 0 0 3074 52 5448 7095 97 3 0 0
8 0 0 121436 13968 205872 0 0 2519 81 5714 7279 98 2 0 0
8 0 0 124436 13968 205872 0 0 1 56 5242 6864 98 2 0 0
8 0 0 111324 13968 205872 0 0 2 22 10660 6686 97 3 0 0
8 0 0 107824 13968 205872 0 0 0 24 14329 8147 97 3 0 0
8 0 0 110420 13968 205872 0 0 0 68 13486 6985 98 2 0 0
8 0 0 110024 13968 205872 0 0 0 19 13085 6659 98 2 0 0
8 0 0 109932 13968 205872 0 0 0 3 12952 6415 98 2 0 0
8 0 0 108552 13968 205880 0 0 2 41 13400 7349 98 2 0 0
Few shots with kvm_stat on the host:
Every 2.0s: kvm_stat -1
Wed Dec 1 04:45:47 2010
efer_reload 0 0
exits 56264102 14074
fpu_reload 311506 50
halt_exits 4733166 935
halt_wakeup 3845079 840
host_state_reload 8795964 4085
hypercalls 0 0
insn_emulation 13573212 7249
insn_emulation_fail 0 0
invlpg 1846050 20
io_exits 3579406 843
irq_exits 3038887 4879
irq_injections 5242157 3681
irq_window 124361 540
largepages 2253 0
mmio_exits 64274 20
mmu_cache_miss 664011 16
mmu_flooded 164506 1
mmu_pde_zapped 212686 8
mmu_pte_updated 729268 0
mmu_pte_write 81323616 551
mmu_recycled 277 0
mmu_shadow_zapped 652691 23
mmu_unsync 5630 8
nmi_injections 0 0
nmi_window 0 0
pf_fixed 17470658 218
pf_guest 13352205 81
remote_tlb_flush 1898930 96
request_irq 0 0
signal_exits 0 0
tlb_flush 5827433 108
Every 2.0s: kvm_stat -1
Wed Dec 1 04:47:33 2010
efer_reload 0 0
exits 58155746 18954
fpu_reload 318003 61
halt_exits 4839340 1082
halt_wakeup 3940964 984
host_state_reload 9267420 4803
hypercalls 0 0
insn_emulation 14376685 7721
insn_emulation_fail 0 0
invlpg 1855758 13
io_exits 3676471 993
irq_exits 3609310 5363
irq_injections 5648007 3922
irq_window 181397 517
largepages 2253 0
mmio_exits 65862 14
mmu_cache_miss 666017 2
mmu_flooded 164784 0
mmu_pde_zapped 213208 1
mmu_pte_updated 731301 1
mmu_pte_write 81455666 14
mmu_recycled 277 0
mmu_shadow_zapped 653910 1
mmu_unsync 5461 -2
nmi_injections 0 0
nmi_window 0 0
pf_fixed 17530360 154
pf_guest 13388143 89
remote_tlb_flush 1915787 39
request_irq 0 0
signal_exits 0 0
tlb_flush 5857307 68
Every 2.0s: kvm_stat -1
Wed Dec 1 04:47:46 2010
efer_reload 0 0
exits 58382002 14542
fpu_reload 318544 32
halt_exits 4850647 541
halt_wakeup 3951056 441
host_state_reload 9317479 2669
hypercalls 0 0
insn_emulation 14464287 5075
insn_emulation_fail 0 0
invlpg 1856481 34
io_exits 3686632 456
irq_exits 3670192 3742
irq_injections 5692201 2471
irq_window 186987 126
largepages 2253 0
mmio_exits 65981 6
mmu_cache_miss 666184 14
mmu_flooded 164819 1
mmu_pde_zapped 213264 5
mmu_pte_updated 731432 2
mmu_pte_write 81473978 563
mmu_recycled 277 0
mmu_shadow_zapped 654130 10
mmu_unsync 5410 4
nmi_injections 0 0
nmi_window 0 0
pf_fixed 17536667 653
pf_guest 13391345 300
remote_tlb_flush 1917634 120
request_irq 0 0
signal_exits 0 0
tlb_flush 5860221 205
'perf top' on the host:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
PerfTop: 3894 irqs/sec kernel: 2.4% exact: 0.0% [1000Hz
cycles], (all, 4 CPUs)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ________________________
______________________________________________________________
109.00 15.1% vmx_vcpu_run
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko
64.00 8.9% copy_user_generic_string [kernel.kallsyms]
61.00 8.4% __ticket_spin_lock [kernel.kallsyms]
47.00 6.5% vcpu_enter_guest
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
23.00 3.2% schedule [kernel.kallsyms]
18.00 2.5% kvm_read_guest
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
18.00 2.5% gfn_to_hva
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
17.00 2.4% native_write_msr_safe [kernel.kallsyms]
17.00 2.4% x86_decode_insn
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
13.00 1.8% rtl8169_interrupt
/lib/modules/2.6.35-23-server/kernel/drivers/net/r8169.ko
13.00 1.8% native_read_msr_safe [kernel.kallsyms]
11.00 1.5% paging64_walk_addr
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
10.00 1.4% vmcs_writel
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko
10.00 1.4% PyEval_EvalFrameEx /usr/bin/python2.6
10.00 1.4% update_curr [kernel.kallsyms]
10.00 1.4% x86_emulate_insn
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
9.00 1.2% vmx_get_cpl
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko
9.00 1.2% __srcu_read_lock [kernel.kallsyms]
9.00 1.2% emulate_instruction
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
9.00 1.2% __memcpy [kernel.kallsyms]
9.00 1.2% handle_exception
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko
8.00 1.1% vmx_complete_interrupts
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko
8.00 1.1% native_read_tsc [kernel.kallsyms]
8.00 1.1% __vcpu_run
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko
7.00 1.0% fput [kernel.kallsyms]
7.00 1.0% __switch_to [kernel.kallsyms]
6.00 0.8% vmx_cache_reg
/lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko
6.00 0.8% _raw_spin_lock_irqsave [kernel.kallsyms]
'perf top' on guest:
------------------------------------------------------------------------------
PerfTop: 13547 irqs/sec kernel:99.4% [100000 cpu-clock-msecs],
(all, 4 CPUs)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
119287.00 - 94.8% : _spin_lock
1108.00 - 0.9% : do_page_fault
599.00 - 0.5% : _spin_unlock_irqrestore
512.00 - 0.4% : finish_task_switch
432.00 - 0.3% : clear_page_c
361.00 - 0.3% : __do_softirq
214.00 - 0.2% : native_flush_tlb
158.00 - 0.1% : native_set_pte_at
147.00 - 0.1% : flush_tlb_page
141.00 - 0.1% : retint_careful
101.00 - 0.1% : kmem_cache_alloc
95.00 - 0.1% : get_page_from_freelist
94.00 - 0.1% : unmap_vmas
82.00 - 0.1% : native_set_pmd
74.00 - 0.1% : virtnet_poll [virtio_net]
WARNING: failed to keep up with mmap data. Last read 810 msecs ago.
WARNING: failed to keep up with mmap data. Last read 0 msecs ago.
WARNING: failed to keep up with mmap data. Last read 0 msecs ago.
WARNING: failed to keep up with mmap data. Last read 0 msecs ago.
'perf top' on the guest under normal operation (nohz, clocksource=kvm_clock):
------------------------------------------------------------------------------
PerfTop: 39501 irqs/sec kernel:99.8% [100000 cpu-clock-msecs],
(all, 4 CPUs)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
256987.00 - 98.7% : native_safe_halt
377.00 - 0.1% : __do_softirq
332.00 - 0.1% : finish_task_switch
257.00 - 0.1% : do_page_fault
201.00 - 0.1% : _spin_unlock_irqrestore
121.00 - 0.0% : tick_nohz_stop_sched_tick
113.00 - 0.0% : pvclock_clocksource_read
106.00 - 0.0% : flush_tlb_page
99.00 - 0.0% : tick_nohz_restart_sched_tick
59.00 - 0.0% : system_call_after_swapgs
55.00 - 0.0% : fget_light
52.00 - 0.0% : kmem_cache_alloc
52.00 - 0.0% : do_sys_poll
50.00 - 0.0% : perf_poll
47.00 - 0.0% : native_flush_tlb
'perf top' on the guest under normal operation (with
clocksource=acpi_pm nohz=off highres=off):
------------------------------------------------------------------------------
PerfTop: 949 irqs/sec kernel:89.3% [100000 cpu-clock-msecs],
(all, 4 CPUs)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
10196.00 - 92.3% : native_safe_halt
145.00 - 1.3% : clear_page_c
139.00 - 1.3% : do_page_fault
100.00 - 0.9% : acpi_pm_read
52.00 - 0.5% : flush_tlb_page
49.00 - 0.4% : finish_task_switch
32.00 - 0.3% : native_flush_tlb
27.00 - 0.2% : __do_softirq
24.00 - 0.2% : _spin_unlock_irqrestore
22.00 - 0.2% : native_set_pmd
20.00 - 0.2% : native_set_pte_at
19.00 - 0.2% : _spin_lock
18.00 - 0.2% : generic_unplug_device
17.00 - 0.2% : tick_nohz_stop_sched_tick
13.00 - 0.1% : unmap_vmas
'perf top' on the guest when slowdown is happening (with
clocksource=acpi_pm nohz=off highres=off):
------------------------------------------------------------------------------
PerfTop: 966 irqs/sec kernel:94.0% [100000 cpu-clock-msecs],
(all, 4 CPUs)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
10799.00 - 91.2% : native_safe_halt
514.00 - 4.3% : _spin_lock
115.00 - 1.0% : clear_page_c
87.00 - 0.7% : do_page_fault
68.00 - 0.6% : __do_softirq
49.00 - 0.4% : acpi_pm_read
44.00 - 0.4% : _spin_unlock_irqrestore
39.00 - 0.3% : finish_task_switch
23.00 - 0.2% : flush_tlb_page
21.00 - 0.2% : native_flush_tlb
8.00 - 0.1% : native_set_pmd
8.00 - 0.1% : native_set_pte_at
7.00 - 0.1% : tick_nohz_restart_sched_tick
6.00 - 0.1% : tick_nohz_stop_sched_tick
Thanks,
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html