Re: vmm disk unavailable after forceful vm termination
On 2019-11-01 12:40, Mike Larkin wrote:
> On Fri, Nov 01, 2019 at 09:20:58AM -0400, Johan Huldtgren wrote:
> > [...]
>
> Sometimes vmd gets really stuck like this and even vmctl stop -f won't stop
> it and rcctl stop vmd also fails. You probably have a vmd spinning at 100%,
> kill that one manually and it should free things up.
>
> I know about the problem, but have not had a chance to fix it yet.
>
> -ml

Just for the archives, this happened again and just as you stated there
was a vmd process still spinning at 100% from the forcefully killed vm.
Killing it let me just restart the vm again without any of the other
steps.

thanks,

.jh
Re: vmm disk unavailable after forceful vm termination
On Fri, Nov 01, 2019 at 09:20:58AM -0400, Johan Huldtgren wrote:
> [...]

Sometimes vmd gets really stuck like this and even vmctl stop -f won't stop
it and rcctl stop vmd also fails. You probably have a vmd spinning at 100%,
kill that one manually and it should free things up.

I know about the problem, but have not had a chance to fix it yet.

-ml
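Mike's suggestion to find the vmd process spinning at 100% and kill it by hand could be scripted roughly as follows. This is only a sketch: the helper name `busy_vmd_pids` and the default threshold of 90 are made up for illustration, and it assumes `ps -A -o pid,pcpu,comm` prints three whitespace-separated columns with `vmd` as the command name.

```shell
#!/bin/sh
# busy_vmd_pids (hypothetical helper): read "PID %CPU COMMAND" lines on
# stdin and print the PIDs of vmd processes at or above a CPU threshold.
# The threshold defaults to 90 (percent); both the name and the default
# are illustrative choices, not anything vmd or vmctl provides.
busy_vmd_pids() {
    threshold="${1:-90}"
    # The ps header line is skipped implicitly: its third field is
    # "COMMAND", which never equals "vmd".
    awk -v t="$threshold" '$3 == "vmd" && $2 + 0 >= t + 0 { print $1 }'
}
```

To use it, run `ps -A -o pid,pcpu,comm | busy_vmd_pids 90`, check that the PID it prints really is the leftover process of the killed vm, and then `doas kill` that PID. Whether a plain `kill` is enough or a stronger signal is needed is not stated in the thread.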
vmm disk unavailable after forceful vm termination
hello,

I have vmd running on -current, in it I have an Ubuntu vm (18.04.3 LTS),
every now and then the Ubuntu vm will hang hard, console is dead, only
option is to restart it. Now at that point a graceful restart won't
work, 'vmctl stop n' will run and return but nothing will happen. So
the only option is 'vmctl stop -f n', this will kill the vm. However
after that the vm will disappear from the list of vms.

Before killing:

$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
    2 71924     1    2.0G    1.1G   ttyp2        johan  running plex.vm
    1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
    3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
    4     -     1    512M       -       -        johan  stopped i386-ports.vm

After killing:

$ vmctl stop -f 2
stopping vm: forced to terminate vm 2

$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
    1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
    3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
    4     -     1    512M       -       -        johan  stopped i386-ports.vm

The vm is now gone, I can't just start it with 'vmctl start n',
but it's defined in vm.conf so let's try starting it by name:

$ vmctl start plex.vm
vmctl: start vm command failed: Operation already in progress

ok.. so let's restart vmd.

$ doas rcctl stop vmd
vmd(ok)
$ doas rcctl start vmd
vmd(ok)

but it's not actually running.

$ vmctl status
vmctl: connect: /var/run/vmd.sock: Connection refused

Let's try starting manually with debug

$ doas vmd -d
startup
warning: macro 'sets' not used
can't open disk /ftp/vm/plex.img: Resource temporarily unavailable
failed to start vm plex.vm
parent: configuration failed
priv exiting, pid 9509
control exiting, pid 63731
vmm exiting, pid 28106

So it seems the disk image is now in some state which won't let it be read
again?

$ ls -al /ftp/vm/plex.img
-rw-------  1 root  wheel  32212254720 Nov  1 08:07 /ftp/vm/plex.img

$ doas file /ftp/vm/plex.img
/ftp/vm/plex.img: x86 boot sector; partition 1: ID=0x83, active, starthead 32, startsector 2048, 62910464 sectors

At this point the only solution is rebooting the host running vmd.
After that everything will work just fine again until the next time
this happens. I don't know if this is known or a bug, googling and
scanning through marc.info I couldn't find any reports. I don't know
if this is relevant but when I restarted this vm yesterday (along with
hanging hard it sometimes just dies and is reported as stopped), I see
this in /var/log/daemon:

Oct 31 07:58:25 absu vmd[17613]: plex.vm: started vm 2 successfully, tty /dev/ttyp2
Oct 31 07:58:29 absu vmd[71924]: vcpu_process_com_data: guest reading com1 when not ready
Oct 31 07:58:35 absu vmd[71924]: vioblk_notifyq: unsupported command 0x8
Oct 31 07:58:52 absu vmd[71924]: vcpu_process_com_data: guest reading com1 when not ready

Full dmesg of the vmd host below, let me know if I can provide any further
details.

thanks,

.jh

---

OpenBSD 6.6-current (GENERIC.MP) #407: Mon Oct 28 00:42:58 MDT 2019
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 137318658048 (130957MB)
avail mem = 133144268800 (126976MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x7db4c000 (134 entries)
bios0: vendor American Megatrends Inc. version "3703" date 04/24/2018
bios0: ASUSTeK COMPUTER INC. Z10PE-D16 Series
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT MCFG EINJ UEFI HPET MSCT SLIT SRAT WDDT SSDT SPMI SSDT SSDT PRAD DMAR HEST BERT ERST
acpi0: wakeup devices IP2P(S3) EHC1(S4) BR1A(S4) BR1B(S4) BR2A(S4) BR2B(S4) BR2C(S4) BR2D(S4) BR3A(S4) BR3B(S4) BR3C(S4) BR3D(S4) RP01(S4) RP02(S4) RP03(S4) RP04(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2394.84 MHz, 06-3f-02
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2394.47 MHz, 06-3f-02
cpu1: