Re: VMD consumes 100% cpu after unpausing guest
* Dave Voutila [2018-02-27 21:29:25 -0500]:
> I can confirm this patch resolves the issue I reported.
> I _think_ I'm seeing a similar CPU load drop as well, but definitely have
> paused/unpaused the guest multiple times without issues.

Thanks Dave and Peter for testing. I will commit this.

I cannot explain a general decrease in CPU load because these lines are in
the code path only when you unpause or receive a vm.

* Peter Hessler [2018-02-27 11:16:52 +0100]:
> (btw, should rtc_fireper() receive a similar change?)

rtc_fireper is unrelated to the cause of this. rtc_reschedule_per will do
an event_add for rtc_fireper if required and rtc_fireper keeps on doing an
event_add for itself.

--
Pratik
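The self-re-arming pattern described here (a callback that keeps doing an event_add for itself) can be sketched as a small simulation. This is an illustrative model, not vmd or libevent code: `struct sim_timer`, `fire_periodic` and `run_loop` are hypothetical names. The zero-interval case is an assumption about why an event loop would spin, not something the thread confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-timer event loop; models the pattern, not libevent. */
struct sim_timer {
	int pending;          /* 1 while the timer is scheduled */
	uint64_t deadline_us; /* simulated absolute time at which it fires */
	uint64_t period_us;   /* interval used when the callback re-arms */
	unsigned fires;       /* how often the callback has run */
};

/* Callback in the style described for rtc_fireper: work, then re-add self. */
void
fire_periodic(struct sim_timer *t, uint64_t now_us)
{
	t->fires++;
	t->deadline_us = now_us + t->period_us;
	t->pending = 1;
}

/*
 * Dispatch at most max_iter timer events and return how much simulated
 * time passed. A healthy loop sleeps until each deadline, so time moves
 * forward; with a zero period (assumed failure mode) it never sleeps.
 */
uint64_t
run_loop(struct sim_timer *t, unsigned max_iter)
{
	uint64_t now_us = 0;
	unsigned i;

	assert(t != NULL);
	for (i = 0; i < max_iter && t->pending; i++) {
		if (t->deadline_us > now_us)
			now_us = t->deadline_us; /* the loop would block here */
		t->pending = 0;
		fire_periodic(t, now_us);
	}
	return now_us;
}
```

With a non-zero period, every dispatch is separated by a sleep. With a period of zero, a thousand dispatches consume no simulated time at all, which is the shape of a thread that returns from the event wait immediately and runs again.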
Re: VMD consumes 100% cpu after unpausing guest
Peter Hessler <phess...@openbsd.org> writes:

> On 2018 Feb 26 (Mon) at 18:52:34 -0800 (-0800), Pratik Vyas wrote:
> :* Dave Voutila <d...@sisu.io> [2018-02-22 23:40:21 -0500]:
> :
> :> >Synopsis: VMD consumes 100% cpu after unpausing guest
> :> >Category: amd64
> :> >Environment:
> :> System : OpenBSD 6.2
> :> Details : OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
> :> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> :>
> :> Architecture: OpenBSD.amd64
> :> Machine : amd64
> :>
> :> >Description:
> :>
> :> Not sure if this is a known issue, but I couldn't find anything
> :> searching the lists.
> :>
> :> Using an Alpine Linux guest vm, I can successfully pause the guest using
> :> `vmctl pause 1` and some time later resume it using `vmctl unpause 1`.
> :>
> :> Unpausing works as the guest comes back to life, I can SSH back in, and
> :> it's fine. However, on the host the vmd process representing that guest
> :> sits at 100% CPU utilization with 1 thread constantly queueing onto a
> :> cpu and running. The guest reports normal load so it must be one of the
> :> 2 threads.
> :
> :This should fix it.
> :
> :Use rtc_reschedule_per in mc146818_start instead of re arming the
> :periodic interrupt without checking if it's enabled in REGB.
> :
> :ok?
> :
> :--
> :Pratik
> :
> :Index: usr.sbin/vmd/mc146818.c
> :===
> :RCS file: /home/pdvyas/cvs/src/usr.sbin/vmd/mc146818.c,v
> :retrieving revision 1.15
> :diff -u -p -a -u -r1.15 mc146818.c
> :--- usr.sbin/vmd/mc146818.c 9 Jul 2017 00:51:40 - 1.15
> :+++ usr.sbin/vmd/mc146818.c 27 Feb 2018 02:47:18 -
> :@@ -354,6 +354,6 @@ mc146818_stop()
> : void
> : mc146818_start()
> : {
> :-	evtimer_add(, _tv);
> : 	evtimer_add(, _tv);
> :+	rtc_reschedule_per();
> : }
> :
>
> This helps a lot with the CPU load on a vmd host. Drops my single guest
> from ~50% CPU to ~9% CPU on the host.

I can confirm this patch resolves the issue I reported.

I _think_ I'm seeing a similar CPU load drop as well, but definitely have
paused/unpaused the guest multiple times without issues.
Re: VMD consumes 100% cpu after unpausing guest
On 2018 Feb 26 (Mon) at 18:52:34 -0800 (-0800), Pratik Vyas wrote:
:* Dave Voutila <d...@sisu.io> [2018-02-22 23:40:21 -0500]:
:
:> >Synopsis: VMD consumes 100% cpu after unpausing guest
:> >Category: amd64
:> >Environment:
:> System : OpenBSD 6.2
:> Details : OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
:> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
:>
:> Architecture: OpenBSD.amd64
:> Machine : amd64
:>
:> >Description:
:>
:> Not sure if this is a known issue, but I couldn't find anything
:> searching the lists.
:>
:> Using an Alpine Linux guest vm, I can successfully pause the guest using
:> `vmctl pause 1` and some time later resume it using `vmctl unpause 1`.
:>
:> Unpausing works as the guest comes back to life, I can SSH back in, and
:> it's fine. However, on the host the vmd process representing that guest
:> sits at 100% CPU utilization with 1 thread constantly queueing onto a
:> cpu and running. The guest reports normal load so it must be one of the
:> 2 threads.
:
:This should fix it.
:
:Use rtc_reschedule_per in mc146818_start instead of re arming the
:periodic interrupt without checking if it's enabled in REGB.
:
:ok?
:
:--
:Pratik
:
:Index: usr.sbin/vmd/mc146818.c
:===
:RCS file: /home/pdvyas/cvs/src/usr.sbin/vmd/mc146818.c,v
:retrieving revision 1.15
:diff -u -p -a -u -r1.15 mc146818.c
:--- usr.sbin/vmd/mc146818.c 9 Jul 2017 00:51:40 - 1.15
:+++ usr.sbin/vmd/mc146818.c 27 Feb 2018 02:47:18 -
:@@ -354,6 +354,6 @@ mc146818_stop()
: void
: mc146818_start()
: {
:-	evtimer_add(, _tv);
: 	evtimer_add(, _tv);
:+	rtc_reschedule_per();
: }
:

This helps a lot with the CPU load on a vmd host. Drops my single guest
from ~50% CPU to ~9% CPU on the host.

OK

(btw, should rtc_fireper() receive a similar change?)

--
The right half of the brain controls the left half of the body.
This means that only left handed people are in their right mind.
Re: VMD consumes 100% cpu after unpausing guest
* Dave Voutila <d...@sisu.io> [2018-02-22 23:40:21 -0500]:
> >Synopsis: VMD consumes 100% cpu after unpausing guest
> >Category: amd64
> >Environment:
> System : OpenBSD 6.2
> Details : OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
>
> >Description:
>
> Not sure if this is a known issue, but I couldn't find anything
> searching the lists.
>
> Using an Alpine Linux guest vm, I can successfully pause the guest using
> `vmctl pause 1` and some time later resume it using `vmctl unpause 1`.
>
> Unpausing works as the guest comes back to life, I can SSH back in, and
> it's fine. However, on the host the vmd process representing that guest
> sits at 100% CPU utilization with 1 thread constantly queueing onto a
> cpu and running. The guest reports normal load so it must be one of the
> 2 threads.

This should fix it.

Use rtc_reschedule_per in mc146818_start instead of re arming the
periodic interrupt without checking if it's enabled in REGB.

ok?

--
Pratik

Index: usr.sbin/vmd/mc146818.c
===
RCS file: /home/pdvyas/cvs/src/usr.sbin/vmd/mc146818.c,v
retrieving revision 1.15
diff -u -p -a -u -r1.15 mc146818.c
--- usr.sbin/vmd/mc146818.c 9 Jul 2017 00:51:40 - 1.15
+++ usr.sbin/vmd/mc146818.c 27 Feb 2018 02:47:18 -
@@ -354,6 +354,6 @@ mc146818_stop()
 void
 mc146818_start()
 {
-	evtimer_add(, _tv);
 	evtimer_add(, _tv);
+	rtc_reschedule_per();
 }
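The idea behind the patch, checking REGB before arming the periodic timer, can be sketched in isolation. This is a minimal model and not the vmd source: `struct rtc_model`, `start_unconditional` and `start_checked` are hypothetical names made up for illustration. The bit positions (PIE is 0x40 in register B, rate select is the low nibble of register A) and the rate formula for a 32.768 kHz timebase follow the MC146818 datasheet. That the interval may still be zero on the old path is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MC146818 register bits, per the chip's datasheet. */
#define REGA_RATE_MASK 0x0f /* RS3..RS0: periodic rate select */
#define REGB_PIE       0x40 /* periodic interrupt enable */

/* Hypothetical stand-in for the emulated RTC state; not vmd's struct. */
struct rtc_model {
	uint8_t rega;
	uint8_t regb;
	bool per_armed;    /* is the periodic timer scheduled? */
	uint32_t per_usec; /* period in microseconds when armed */
};

/* Old behaviour as described in the thread: re-arm without checking. */
void
start_unconditional(struct rtc_model *r)
{
	assert(r != NULL);
	r->per_armed = true; /* assumption: per_usec may still be 0 here */
}

/* Patched idea: arm only if the guest enabled the periodic interrupt. */
void
start_checked(struct rtc_model *r)
{
	uint8_t rs;
	uint32_t hz;

	assert(r != NULL);
	rs = r->rega & REGA_RATE_MASK;
	r->per_armed = false;
	if (!(r->regb & REGB_PIE) || rs == 0)
		return;               /* guest never asked for it */
	if (rs <= 2)
		rs += 7;              /* RS=1 and RS=2 mean 256 Hz and 128 Hz */
	hz = 32768u >> (rs - 1);      /* e.g. RS=6 gives 1024 Hz */
	r->per_usec = 1000000u / hz;
	r->per_armed = true;
}
```

On a freshly zeroed model, `start_unconditional` leaves the timer armed with a zero period, while `start_checked` leaves it disarmed until PIE and a rate are set. With `rega = 0x26` and `regb = REGB_PIE` the checked path arms the timer at 976 microseconds (1024 Hz, truncated).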
Re: VMD consumes 100% cpu after unpausing guest
* Dave Voutila <d...@sisu.io> [2018-02-22 23:40:21 -0500]:
> >Synopsis: VMD consumes 100% cpu after unpausing guest
> >Category: amd64
> >Environment:
> System : OpenBSD 6.2
> Details : OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
>
> >Description:
>
> Not sure if this is a known issue, but I couldn't find anything
> searching the lists.
>
> Using an Alpine Linux guest vm, I can successfully pause the guest using
> `vmctl pause 1` and some time later resume it using `vmctl unpause 1`.
>
> Unpausing works as the guest comes back to life, I can SSH back in, and
> it's fine. However, on the host the vmd process representing that guest
> sits at 100% CPU utilization with 1 thread constantly queueing onto a
> cpu and running. The guest reports normal load so it must be one of the
> 2 threads.

Thanks Dave for the report. I can reproduce this with a receive as well.

Probably mc146818_start doesn't do the right thing. Will report back
when I find a solution.

--
Pratik
VMD consumes 100% cpu after unpausing guest
>Synopsis: VMD consumes 100% cpu after unpausing guest
>Category: amd64
>Environment:
System : OpenBSD 6.2
Details : OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64

>Description:

Not sure if this is a known issue, but I couldn't find anything
searching the lists.

Using an Alpine Linux guest vm, I can successfully pause the guest using
`vmctl pause 1` and some time later resume it using `vmctl unpause 1`.

Unpausing works as the guest comes back to life, I can SSH back in, and
it's fine. However, on the host the vmd process representing that guest
sits at 100% CPU utilization with 1 thread constantly queueing onto a
cpu and running. The guest reports normal load so it must be one of the
2 threads.

Taking a ktrace of that particular thread, and slimming for sake of
email, it's constantly calling clock_gettime and kevent:

CALL  futex(0x7361d183cd0,0x2,1,0,0)
RET   futex 0
CALL  kevent(5,0,0,0x7361d17c800,64,0x735f272b7c0)
STRU  struct timespec
RET   kevent 0
CALL  clock_gettime(CLOCK_MONOTONIC,0x735f272b860)
STRU  struct timespec
RET   clock_gettime 0
CALL  kevent(5,0,0,0x7361d17c800,64,0x735f272b7c0)
STRU  struct timespec
RET   kevent 0
CALL  clock_gettime(CLOCK_MONOTONIC,0x735f272b860)
STRU  struct timespec
RET   clock_gettime 0
CALL  kevent(5,0,0,0x7361d17c800,64,0x735f272b7c0)
STRU  struct timespec
RET   kevent 0
CALL  clock_gettime(CLOCK_MONOTONIC,0x735f272b860)
STRU  struct timespec
RET   clock_gettime 0
CALL  kevent(5,0,0,0x7361d17c800,64,0x735f272b7c0)
STRU  struct timespec
RET   kevent 0
CALL  clock_gettime(CLOCK_MONOTONIC,0x735f272b860)
STRU  struct timespec
RET   clock_gettime 0
CALL  kevent(5,0,0,0x7361d17c800,64,0x735f272b7c0)
STRU  struct timespec
RET   kevent 0
...etc.
VMD reports nothing strange, which I'd expect as the guest vm is
perfectly functional during this period even while that thread burns up
the CPU:

startup
/etc/vm.conf:3: switch "uplink" registered
vm_register: registering vm 1
/etc/vm.conf:12: vm "alpine" registered (disabled)
vm_priv_brconfig: interface bridge0 description switch1-uplink
vmd_configure: not creating vm alpine (disabled)
config_setconfig: setting config
config_getconfig: retrieving config
config_getconfig: retrieving config
config_getconfig: retrieving config
vm_opentty: vm alpine tty /dev/ttyp5 uid 1000 gid 4 mode 620
vm_register: registering vm 1
vm_priv_ifconfig: interface tap0 description vm1-if0-alpine
vm_priv_ifconfig: switch "uplink" interface bridge0 add tap0
alpine: started vm 1 successfully, tty /dev/ttyp5
loadfile_bios: loaded BIOS image
run_vm: initializing hardware for vm alpine
virtio_init: vm "alpine" vio0 lladdr fe:e1:bb:d1:1b:bd
run_vm: starting vcpu threads for vm alpine
vcpu_reset: resetting vcpu 0 for vm 3
run_vm: waiting on events for VM alpine
i8259_write_datareg: master pic, reset IRQ vector to 0x8
i8259_write_datareg: slave pic, reset IRQ vector to 0x70
vcpu_exit_i8253: channel 0 reset, mode=0, start=65535
virtio_blk_io: device reset
virtio_blk_io: device reset
vcpu_process_com_lcr: set baudrate = 115200
vcpu_process_com_lcr: set baudrate = 115200
i8259_write_datareg: master pic, reset IRQ vector to 0x30
i8259_write_datareg: slave pic, reset IRQ vector to 0x38
vcpu_process_com_lcr: set baudrate = 115200
vcpu_exit_i8253: channel 0 reset, mode=7, start=3977
vcpu_exit_i8253: channel 2 reset, mode=7, start=65535
vcpu_exit_i8253: channel 2 reset, mode=7, start=65535
vcpu_exit_i8253: channel 2 reset, mode=7, start=65535
vcpu_exit_i8253: channel 2 reset, mode=7, start=65535
vcpu_process_com_lcr: set baudrate = 115200
vcpu_process_com_data: guest reading com1 when not ready
vcpu_process_com_data: guest reading com1 when not ready
vcpu_process_com_data: guest reading com1 when not ready
vcpu_process_com_lcr: set baudrate = 115200
virtio_blk_io: device reset
virtio_blk_io: device reset
virtio_net_io: device reset
alpine: paused vm 1 successfully
alpine: unpaused vm 1 successfully.
rtc_update_rega: set non-32KHz timebase not supported
rtc_fire1: RTC clock drift (44s), requesting guest resync
rtc_update_rega: set non-32KHz timebase not supported

>How-To-Repeat:

Pause an actively running linux guest: `vmctl pause 1`
After some time, resume the guest: `vmctl unpause 1`
Observe CPU utilization of matching VMD process.

>Fix:

Unknown. Stopping the guest through either having it halt or
`vmctl stop ` obviously ends the cpu consumption.

dmesg:
OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17053851648 (16263MB)