Re: Error messages with VMM on 6.6 and 6.7

2020-06-02 Thread Mike Larkin
On Tue, Jun 02, 2020 at 10:18:32AM +0800, jrmu wrote:
> OpenBSD VMM suffers from error messages and possibly spontaneous crashing
> 
>   System  : OpenBSD 6.7
>   Details : OpenBSD 6.7 (GENERIC.MP) #182: Thu May  7 11:11:58 MDT 2020
>   dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> 
> >Description:
>   I ran VMM on OpenBSD 6.6 with ~30 VMs, a mixture of OpenBSD 6.6, 6.7, 
> and Debian, and kept seeing the following error messages in logs:
> 
> May 28 00:54:37 srv1 vmd[97924]: rtc_update_rega: set non-32KHz timebase not supported
> May 28 00:59:05 srv1 vmd[24983]: rtc_update_rega: set non-32KHz timebase not supported
> May 28 01:12:35 srv1 vmd[31276]: rtc_update_rega: set non-32KHz timebase not supported
> May 28 01:14:40 srv1 vmd[31276]: vioblk queue notify - nothing to do?
> May 28 01:15:12 srv1 last message repeated 806 times
> May 28 01:17:03 srv1 last message repeated 78 times
> May 28 01:30:03 srv1 vmd[31276]: vioblk queue notify - nothing to do?
> May 28 01:40:19 srv1 last message repeated 67 times
> May 28 01:44:17 srv1 last message repeated 47 times

Those messages are a bit odd. They mean the in-guest driver notified vioblk
(the VM's disk device) that there were descriptors in the ring, but when the
device-side implementation (in vmd(8)) went to process them, the ring was
empty.

There shouldn't be any side effect, although it does indicate something
unexpected is happening.
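
To illustrate the situation described above, here is a minimal sketch of a
device-side notify handler. It is not vmd(8)'s actual code: the struct names,
RING_SIZE, and queue_notify() are all hypothetical, and the ring is reduced to
an index plus an array of descriptor heads. The device remembers how far it
has consumed the ring; a notification that arrives when the driver's index has
not moved is the "nothing to do?" case.

```c
#include <stdint.h>

#define RING_SIZE 8	/* hypothetical ring size, for illustration only */

/* Simplified stand-in for a virtio available ring (guest-written). */
struct avail_ring {
	uint16_t idx;			/* bumped by the driver per request */
	uint16_t ring[RING_SIZE];	/* descriptor chain head indices */
};

/* Device-side bookkeeping: how far the device has consumed the ring. */
struct device_queue {
	uint16_t last_avail;
};

/*
 * Handle a guest notification. Returns the number of requests consumed;
 * a return value of 0 is the "notify, but ring empty" case.
 */
static int
queue_notify(struct device_queue *dq, const struct avail_ring *avail)
{
	int processed = 0;

	while (dq->last_avail != avail->idx) {
		uint16_t head = avail->ring[dq->last_avail % RING_SIZE];

		(void)head;	/* a real device would service this chain */
		dq->last_avail++;
		processed++;
	}
	return processed;
}
```

In this sketch, calling queue_notify() twice in a row without the driver
advancing idx returns 0 the second time, which is harmless but unexpected.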


> May 28 01:44:19 srv1 vmd[9684]: rtc_update_rega: set non-32KHz timebase not supported
> 

Those messages aren't serious; Linux kernels program the RTC this way. TBH,
that message has probably outlived its usefulness. Someone can remove it at
this point.
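
For context on what "non-32KHz timebase" refers to, here is a minimal sketch
of the check, assuming the standard MC146818 register A layout (bit 7 UIP,
bits 6-4 divider select, bits 3-0 rate select). The macro and function names
are made up for illustration and are not taken from vmd(8).

```c
#include <stdbool.h>
#include <stdint.h>

/* MC146818 register A: bits 6-4 (DV2..DV0) select the oscillator timebase. */
#define REGA_DV_MASK	0x70
#define REGA_DV_32KHZ	0x20	/* DV = 010 selects the 32.768 kHz timebase */

/*
 * Returns true if a guest write to register A keeps the 32.768 kHz
 * timebase; any other divider setting would trigger a warning like
 * the one in the logs.
 */
static bool
rega_uses_32khz_timebase(uint8_t value)
{
	return (value & REGA_DV_MASK) == REGA_DV_32KHZ;
}
```

For example, a write of 0x26 (32.768 kHz timebase, rate select 6) passes this
check, while a write of 0x00 does not.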

> Every 2-3 weeks, the system appeared to crash, but I could not find any other 
> error message that would narrow down the cause. I am not sure if the crash is 
> related to either of those two above error messages.

Likely unrelated; those messages are from vmd(8), a user-mode process. I think
it's difficult for vmd(8) to crash the system.

> 
> Today I upgraded to OpenBSD 6.7 stable with hopes that the problem may have 
> been fixed. However, I still notice the same two error messages:
> 
> May 31 19:06:32 srv1 vmd[72705]: vcpu_process_com_data: guest reading com1 when not ready
> May 31 19:06:33 srv1 last message repeated 2 times
> May 31 19:06:40 srv1 reorder_kernel: kernel relinking done
> May 31 19:09:03 srv1 vmd[72705]: rtc_update_rega: set non-32KHz timebase not supported
> 
> Any workaround or suggestions?

What is the question here? If you are tired of the log messages, you can remove
those particular ones. As I said above, these messages have likely outlived
their usefulness (they were put in during early development to detect weird
corner cases like this).

-ml

> 
> dmesg:
> OpenBSD 6.7 (GENERIC.MP) #182: Thu May  7 11:11:58 MDT 2020
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 34306437120 (32717MB)
> avail mem = 33254100992 (31713MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec830 (156 entries)
> bios0: vendor American Megatrends Inc. version "3.3" date 05/23/2018
> bios0: Supermicro X9DRi-LN4+/X9DR3-LN4+
> acpi0 at bios0: ACPI 4.0
> acpi0: sleep states S0 S1 S4 S5
> acpi0: tables DSDT FACP APIC FPDT SRAT SLIT HPET PRAD SPMI SSDT EINJ ERST HEST BERT DMAR MCFG
> acpi0: wakeup devices P0P9(S1) EUSB(S4) USBE(S4) PEX0(S4) PWVE(S4) NPE1(S4) NPE4(S4) NPE5(S4) NPE6(S4) NPE8(S4) NPEA(S4) NPE2(S4) NPE3(S4) NPE7(S4) NPE9(S4) NPE2(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, 2000.27 MHz, 06-2d-07
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, 2000.02 MHz, 06-2d-07
> cpu1: 
> 

Error messages with VMM on 6.6 and 6.7

2020-06-01 Thread jrmu
OpenBSD VMM suffers from error messages and possibly spontaneous crashing

System  : OpenBSD 6.7
Details : OpenBSD 6.7 (GENERIC.MP) #182: Thu May  7 11:11:58 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64

>Description:
I ran VMM on OpenBSD 6.6 with ~30 VMs, a mixture of OpenBSD 6.6, 6.7, 
and Debian, and kept seeing the following error messages in logs:

May 28 00:54:37 srv1 vmd[97924]: rtc_update_rega: set non-32KHz timebase not supported
May 28 00:59:05 srv1 vmd[24983]: rtc_update_rega: set non-32KHz timebase not supported
May 28 01:12:35 srv1 vmd[31276]: rtc_update_rega: set non-32KHz timebase not supported
May 28 01:14:40 srv1 vmd[31276]: vioblk queue notify - nothing to do?
May 28 01:15:12 srv1 last message repeated 806 times
May 28 01:17:03 srv1 last message repeated 78 times
May 28 01:30:03 srv1 vmd[31276]: vioblk queue notify - nothing to do?
May 28 01:40:19 srv1 last message repeated 67 times
May 28 01:44:17 srv1 last message repeated 47 times
May 28 01:44:19 srv1 vmd[9684]: rtc_update_rega: set non-32KHz timebase not supported

Every 2-3 weeks, the system appeared to crash, but I could not find any other 
error message that would narrow down the cause. I am not sure if the crash is 
related to either of those two above error messages.

Today I upgraded to OpenBSD 6.7 stable with hopes that the problem may have 
been fixed. However, I still notice the same two error messages:

May 31 19:06:32 srv1 vmd[72705]: vcpu_process_com_data: guest reading com1 when not ready
May 31 19:06:33 srv1 last message repeated 2 times
May 31 19:06:40 srv1 reorder_kernel: kernel relinking done
May 31 19:09:03 srv1 vmd[72705]: rtc_update_rega: set non-32KHz timebase not supported

Any workaround or suggestions?

dmesg:
OpenBSD 6.7 (GENERIC.MP) #182: Thu May  7 11:11:58 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34306437120 (32717MB)
avail mem = 33254100992 (31713MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec830 (156 entries)
bios0: vendor American Megatrends Inc. version "3.3" date 05/23/2018
bios0: Supermicro X9DRi-LN4+/X9DR3-LN4+
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC FPDT SRAT SLIT HPET PRAD SPMI SSDT EINJ ERST HEST BERT DMAR MCFG
acpi0: wakeup devices P0P9(S1) EUSB(S4) USBE(S4) PEX0(S4) PWVE(S4) NPE1(S4) NPE4(S4) NPE5(S4) NPE6(S4) NPE8(S4) NPEA(S4) NPE2(S4) NPE3(S4) NPE7(S4) NPE9(S4) NPE2(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, 2000.27 MHz, 06-2d-07
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, 2000.02 MHz, 06-2d-07
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, 2000.02 MHz, 06-2d-07
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, 2000.01 MHz, 06-2d-07
cpu3: