Re: vmm disk unavailable after forceful vm termination

2019-12-07 Thread Johan Huldtgren
On 2019-11-01 12:40, Mike Larkin wrote:
> On Fri, Nov 01, 2019 at 09:20:58AM -0400, Johan Huldtgren wrote:
> > hello,
> > 
> > I have vmd running on -current, and in it I have an Ubuntu vm (18.04.3
> > LTS). Every now and then the Ubuntu vm hangs hard: the console is dead
> > and the only option is to restart it. At that point a graceful restart
> > won't work; 'vmctl stop n' runs and returns, but nothing happens. So
> > the only option is 'vmctl stop -f n', which kills the vm. However,
> > after that the vm disappears from the list of vms.
> > 
> > Before killing:
> > 
> > $ vmctl status
> >    ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
> >     2 71924     1    2.0G    1.1G   ttyp2        johan  running plex.vm
> >     1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
> >     3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
> >     4     -     1    512M       -       -        johan  stopped i386-ports.vm
> > 
> > After killing:
> > 
> > $ vmctl stop -f 2
> > stopping vm: forced to terminate vm 2
> > 
> > $ vmctl status
> >    ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
> >     1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
> >     3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
> >     4     -     1    512M       -       -        johan  stopped i386-ports.vm
> > 
> > The vm is now gone, I can't just start it with 'vmctl start n',
> > but it's defined in vm.conf so let's try starting it by name:
> > 
> > $ vmctl start plex.vm
> > vmctl: start vm command failed: Operation already in progress
> > 
> > ok.. so let's restart vmd.
> > 
> > $ doas rcctl stop vmd
> > vmd(ok)
> > $ doas rcctl start vmd
> > vmd(ok)
> > 
> > but it's not actually running.
> > 
> > $ vmctl status
> > vmctl: connect: /var/run/vmd.sock: Connection refused
> > 
> > Let's try starting manually with debug
> > 
> > $ doas vmd -d
> > startup
> > warning: macro 'sets' not used
> > can't open disk /ftp/vm/plex.img: Resource temporarily unavailable
> > failed to start vm plex.vm
> > parent: configuration failed
> > priv exiting, pid 9509
> > control exiting, pid 63731
> > vmm exiting, pid 28106
> > 
> > So it seems the disk image is now in some state which won't let it be read 
> > again?
> > 
> > $ ls -al /ftp/vm/plex.img
> > -rw-------  1 root  wheel  32212254720 Nov  1 08:07 /ftp/vm/plex.img
> > 
> > $ doas file /ftp/vm/plex.img
> > /ftp/vm/plex.img: x86 boot sector; partition 1: ID=0x83, active, starthead 
> > 32, startsector 2048, 62910464 sectors
> > 
> > At this point the only solution is rebooting the host running vmd.
> > After that everything works fine again until the next time this
> > happens. I don't know if this is a known issue or a new bug; googling
> > and scanning through marc.info, I couldn't find any reports. I don't
> > know if this is relevant, but when I restarted this vm yesterday
> > (besides hanging hard, it sometimes just dies and is reported as
> > stopped), I saw this in /var/log/daemon:
> > 
> > Oct 31 07:58:25 absu vmd[17613]: plex.vm: started vm 2 successfully, tty 
> > /dev/ttyp2
> > Oct 31 07:58:29 absu vmd[71924]: vcpu_process_com_data: guest reading com1 
> > when not ready
> > Oct 31 07:58:35 absu vmd[71924]: vioblk_notifyq: unsupported command 0x8
> > Oct 31 07:58:52 absu vmd[71924]: vcpu_process_com_data: guest reading com1 
> > when not ready
> > 
> > Full dmesg of the vmd host below, let me know if I can provide any further 
> > details.
> > 
> > thanks,
> > 
> > .jh
> > 
> 
> Sometimes vmd gets really stuck like this: even vmctl stop -f won't stop
> it, and rcctl stop vmd also fails. You probably have a vmd process spinning
> at 100% CPU; kill that one manually and it should free things up.
> 
> I know about the problem, but have not had a chance to fix it yet.
> 
> -ml

Just for the archives: this happened again and, just as you stated, there was
a vmd process still spinning at 100% CPU, left over from the forcefully killed
vm. Killing it let me restart the vm without any of the other steps.

thanks,

.jh


Re: vmm disk unavailable after forceful vm termination

2019-11-01 Thread Mike Larkin
On Fri, Nov 01, 2019 at 09:20:58AM -0400, Johan Huldtgren wrote:
> hello,
> 
> I have vmd running on -current, and in it I have an Ubuntu vm (18.04.3
> LTS). Every now and then the Ubuntu vm hangs hard: the console is dead
> and the only option is to restart it. At that point a graceful restart
> won't work; 'vmctl stop n' runs and returns, but nothing happens. So
> the only option is 'vmctl stop -f n', which kills the vm. However,
> after that the vm disappears from the list of vms.
> 
> Before killing:
> 
> $ vmctl status
>    ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
>     2 71924     1    2.0G    1.1G   ttyp2        johan  running plex.vm
>     1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
>     3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
>     4     -     1    512M       -       -        johan  stopped i386-ports.vm
> 
> After killing:
> 
> $ vmctl stop -f 2
> stopping vm: forced to terminate vm 2
> 
> $ vmctl status
>    ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
>     1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
>     3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
>     4     -     1    512M       -       -        johan  stopped i386-ports.vm
> 
> The vm is now gone, I can't just start it with 'vmctl start n',
> but it's defined in vm.conf so let's try starting it by name:
> 
> $ vmctl start plex.vm
> vmctl: start vm command failed: Operation already in progress
> 
> ok.. so let's restart vmd.
> 
> $ doas rcctl stop vmd
> vmd(ok)
> $ doas rcctl start vmd
> vmd(ok)
> 
> but it's not actually running.
> 
> $ vmctl status
> vmctl: connect: /var/run/vmd.sock: Connection refused
> 
> Let's try starting manually with debug
> 
> $ doas vmd -d
> startup
> warning: macro 'sets' not used
> can't open disk /ftp/vm/plex.img: Resource temporarily unavailable
> failed to start vm plex.vm
> parent: configuration failed
> priv exiting, pid 9509
> control exiting, pid 63731
> vmm exiting, pid 28106
> 
> So it seems the disk image is now in some state which won't let it be read 
> again?
> 
> $ ls -al /ftp/vm/plex.img
> -rw-------  1 root  wheel  32212254720 Nov  1 08:07 /ftp/vm/plex.img
> 
> $ doas file /ftp/vm/plex.img
> /ftp/vm/plex.img: x86 boot sector; partition 1: ID=0x83, active, starthead 
> 32, startsector 2048, 62910464 sectors
> 
> At this point the only solution is rebooting the host running vmd.
> After that everything works fine again until the next time this
> happens. I don't know if this is a known issue or a new bug; googling
> and scanning through marc.info, I couldn't find any reports. I don't
> know if this is relevant, but when I restarted this vm yesterday
> (besides hanging hard, it sometimes just dies and is reported as
> stopped), I saw this in /var/log/daemon:
> 
> Oct 31 07:58:25 absu vmd[17613]: plex.vm: started vm 2 successfully, tty 
> /dev/ttyp2
> Oct 31 07:58:29 absu vmd[71924]: vcpu_process_com_data: guest reading com1 
> when not ready
> Oct 31 07:58:35 absu vmd[71924]: vioblk_notifyq: unsupported command 0x8
> Oct 31 07:58:52 absu vmd[71924]: vcpu_process_com_data: guest reading com1 
> when not ready
> 
> Full dmesg of the vmd host below, let me know if I can provide any further 
> details.
> 
> thanks,
> 
> .jh
> 

Sometimes vmd gets really stuck like this: even vmctl stop -f won't stop
it, and rcctl stop vmd also fails. You probably have a vmd process spinning
at 100% CPU; kill that one manually and it should free things up.
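
One way to spot the spinning process is sketched below. It assumes ps(1)
accepts the pid, pcpu and comm keywords and reports the process name as
plain "vmd". The helper name busy_vmd_pids and the 90% threshold are made
up for illustration. A VM that is legitimately busy would show up too, so
compare the PIDs against 'vmctl status' before killing anything.

```shell
#!/bin/sh
# Sketch: list vmd processes whose CPU usage is at or above a threshold.
# busy_vmd_pids is a hypothetical helper, not part of vmd or vmctl.
# It reads "PID %CPU COMMAND" lines on stdin and prints the matching PIDs.
busy_vmd_pids() {
	threshold="${1:-90}"	# assumed default; pick what suits the host
	awk -v min="$threshold" '
		# Skip the header (non-numeric first field) and non-vmd processes.
		$1 ~ /^[0-9]+$/ && $3 == "vmd" && $2 + 0 >= min + 0 { print $1 }
	'
}

# Typical use on the host (review the PIDs before killing anything):
#   ps -axo pid,pcpu,comm | busy_vmd_pids 90
#   doas kill <pid>
```

A PID that is printed here but no longer listed by 'vmctl status' is a
likely candidate for the leftover process from the killed vm.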

I know about the problem, but have not had a chance to fix it yet.

-ml


vmm disk unavailable after forceful vm termination

2019-11-01 Thread Johan Huldtgren
hello,

I have vmd running on -current, and in it I have an Ubuntu vm (18.04.3
LTS). Every now and then the Ubuntu vm hangs hard: the console is dead
and the only option is to restart it. At that point a graceful restart
won't work; 'vmctl stop n' runs and returns, but nothing happens. So
the only option is 'vmctl stop -f n', which kills the vm. However,
after that the vm disappears from the list of vms.

Before killing:

$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
    2 71924     1    2.0G    1.1G   ttyp2        johan  running plex.vm
    1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
    3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
    4     -     1    512M       -       -        johan  stopped i386-ports.vm

After killing:

$ vmctl stop -f 2
stopping vm: forced to terminate vm 2

$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER    STATE NAME
    1 18308     1    8.0G    6.5G   ttyp0        johan  running monitor.vm
    3     -     1    1.0G       -       -        johan  stopped amd64-ports.vm
    4     -     1    512M       -       -        johan  stopped i386-ports.vm

The vm is now gone, I can't just start it with 'vmctl start n',
but it's defined in vm.conf so let's try starting it by name:

$ vmctl start plex.vm
vmctl: start vm command failed: Operation already in progress

ok.. so let's restart vmd.

$ doas rcctl stop vmd
vmd(ok)
$ doas rcctl start vmd
vmd(ok)

but it's not actually running.

$ vmctl status
vmctl: connect: /var/run/vmd.sock: Connection refused

Let's try starting manually with debug

$ doas vmd -d
startup
warning: macro 'sets' not used
can't open disk /ftp/vm/plex.img: Resource temporarily unavailable
failed to start vm plex.vm
parent: configuration failed
priv exiting, pid 9509
control exiting, pid 63731
vmm exiting, pid 28106

So it seems the disk image is now in some state which won't let it be read 
again?

$ ls -al /ftp/vm/plex.img
-rw-------  1 root  wheel  32212254720 Nov  1 08:07 /ftp/vm/plex.img

$ doas file /ftp/vm/plex.img
/ftp/vm/plex.img: x86 boot sector; partition 1: ID=0x83, active, starthead 32, 
startsector 2048, 62910464 sectors

At this point the only solution is rebooting the host running vmd.
After that everything works fine again until the next time this
happens. I don't know if this is a known issue or a new bug; googling
and scanning through marc.info, I couldn't find any reports. I don't
know if this is relevant, but when I restarted this vm yesterday
(besides hanging hard, it sometimes just dies and is reported as
stopped), I saw this in /var/log/daemon:

Oct 31 07:58:25 absu vmd[17613]: plex.vm: started vm 2 successfully, tty 
/dev/ttyp2
Oct 31 07:58:29 absu vmd[71924]: vcpu_process_com_data: guest reading com1 when 
not ready
Oct 31 07:58:35 absu vmd[71924]: vioblk_notifyq: unsupported command 0x8
Oct 31 07:58:52 absu vmd[71924]: vcpu_process_com_data: guest reading com1 when 
not ready

Full dmesg of the vmd host below, let me know if I can provide any further 
details.

thanks,

.jh

---

OpenBSD 6.6-current (GENERIC.MP) #407: Mon Oct 28 00:42:58 MDT 2019
 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 137318658048 (130957MB)
avail mem = 133144268800 (126976MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x7db4c000 (134 entries)
bios0: vendor American Megatrends Inc. version "3703" date 04/24/2018
bios0: ASUSTeK COMPUTER INC. Z10PE-D16 Series
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT MCFG EINJ UEFI HPET MSCT SLIT SRAT WDDT 
SSDT SPMI SSDT SSDT PRAD DMAR HEST BERT ERST
acpi0: wakeup devices IP2P(S3) EHC1(S4) BR1A(S4) BR1B(S4) BR2A(S4) BR2B(S4) 
BR2C(S4) BR2D(S4) BR3A(S4) BR3B(S4) BR3C(S4) BR3D(S4) RP01(S4) RP02(S4) 
RP03(S4) RP04(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2394.84 MHz, 06-3f-02
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2394.47 MHz, 06-3f-02
cpu1: