All my vmds are stuck after syzkaller was running for some 5 days (and
found a couple of bugs!). I'll reinstall the system to get a fresh baseline.

> 1. Are you getting any vmd cores?
>  * sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd
>  * kill all vmd and restart the test. Not sure if that sysctl can be done
>    after boot or if it needs to be in /etc/sysctl.conf

After applying the patches for the two bugs that y'all promptly fixed, I'm
no longer getting any core dumps.
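
On the sysctl.conf question in the quoted text: my understanding is that
kern.nosuidcoredump can be changed on a running system, so the remaining
step is making it survive a reboot. A minimal sketch, assuming root on the
host; persist_sysctl is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: append "name=value" to a sysctl.conf-style file,
# but only if that exact line is not already present.
persist_sysctl() {
    conf=$1
    setting=$2
    grep -qxF "$setting" "$conf" 2>/dev/null || echo "$setting" >> "$conf"
}

# Intended use on the host (as root), mirroring the quoted steps:
#   sysctl kern.nosuidcoredump=3
#   mkdir /var/crash/vmd
#   persist_sysctl /etc/sysctl.conf kern.nosuidcoredump=3
```

Calling the helper twice leaves a single line in the file, so it is safe to
keep in a provisioning script.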

> 2. building a host kernel with VMM_DEBUG will be helpful here to see why
>   VMs are disappearing.

Done. Still running the kernel from a few days back.

> 3. You're running this nested in GCP? Do you know what VMX features they
>   expose to guests? Perhaps there is an assumption being made in vmm that
>   we have a certain feature not being exposed by the underlying GCP
>   hypervisor (although I'm pretty sure that's not the case, might be good
>   to check - VMM_DEBUG will tell us this).


ci-openbsd$ ps ax | grep vmd
39771 ??  Ssp     0:06.67 vmd: vmm (vmd)
 7674 ??  Is      0:00.25 vmd: priv (vmd)
44564 ??  Ssp     0:22.78 vmd: control (vmd)
38905 ??  Ssp     0:13.43 /usr/sbin/vmd
88755 ??  Rp/3  4610:25.87 vmd: ci-openbsd-main-0 (vmd)
 9636 ??  Rp/1  4360:16.00 vmd: ci-openbsd-main-2 (vmd)
59918 ??  Rp/1  3559:18.43 vmd: ci-openbsd-main-1 (vmd)

The VMs are unpingable:

ci-openbsd$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            10.128.0.1         UGS        8    12002     -     8 vio0
224/4              127.0.0.1          URS        0        0 32768     8 lo0
10.128.0.1         42:01:0a:80:00:01  UHLch      1     8300     -     7 vio0
10.128.0.1/32      10.128.0.63        UCS        1        0     -     8 vio0
10.128.0.63        42:01:0a:80:00:3f  UHLl       0     9050     -     1 vio0
10.128.0.63/32     10.128.0.63        UCn        0        0     -     4 vio0
100.65.69.2/31     100.65.69.2        UCn        0        3     -     4 tap2
100.65.69.2        fe:e1:ba:d7:6a:76  UHLl       0        0     -     1 tap2
100.65.104.2/31    100.65.104.2       UCn        0        3     -     4 tap0
100.65.104.2       fe:e1:ba:d8:6a:68  UHLl       0        1     -     1 tap0
100.65.212.2/31    100.65.212.2       UCn        0        3     -     4 tap1
100.65.212.2       fe:e1:ba:da:00:a0  UHLl       0        1     -     1 tap1
127/8              127.0.0.1          UGRS       0        0 32768     8 lo0
127.0.0.1          127.0.0.1          UHhl       1       47 32768     1 lo0

ci-openbsd$ ping 100.65.69.3
PING 100.65.69.3 (100.65.69.3): 56 data bytes
^C
--- 100.65.69.3 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$ ping 100.65.104.3
PING 100.65.104.3 (100.65.104.3): 56 data bytes
^C
--- 100.65.104.3 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$ ping 100.65.212.3
PING 100.65.212.3 (100.65.212.3): 56 data bytes
^C
--- 100.65.212.3 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$
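
The addresses pinged above are the other halves of the /31 pairs that the
tap interfaces hold in the routing table. A rough sketch of automating that
check, assuming each guest sits on the /31 peer of its host-side tap
address; peer_of and check_guests are hypothetical names:

```shell
#!/bin/sh
# Return the other address of a /31 pair by flipping the lowest bit
# of the last octet (100.65.69.2 <-> 100.65.69.3).
peer_of() {
    prefix=${1%.*}
    last=${1##*.}
    echo "$prefix.$((last ^ 1))"
}

# Ping the assumed guest address for each host-side tap address given.
check_guests() {
    for host_addr in "$@"; do
        guest=$(peer_of "$host_addr")
        if ping -c 2 -w 1 "$guest" >/dev/null 2>&1; then
            echo "$guest reachable"
        else
            echo "$guest unreachable"
        fi
    done
}

# Example, using the tap addresses from the routing table:
#   check_guests 100.65.69.2 100.65.104.2 100.65.212.2
```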

I tried to attach to them but I don't think the results are very
satisfactory:

(gdb) attach 59918
Attaching to program: /usr/sbin/vmd, process 59918
ptrace: No such process.
(gdb) attach 88755
Attaching to program: /usr/sbin/vmd, process 88755
Couldn't get registers: Device busy.
Couldn't get registers: Device busy.
(gdb) Reading symbols from /usr/lib/libutil.so.13.0...done.
Reading symbols from /usr/lib/libevent.so.4.1...done.
Reading symbols from /usr/lib/libc.so.92.5...done.
Reading symbols from /usr/libexec/ld.so...done.
[New thread 376076]
[New thread 433606]
[Switching to thread 152280]
0x0000197245b834f5 in event_queue_insert (base=0x19719ae76400, ev=<optimized out>, queue=8) at /usr/src/lib/libevent/event.c:954
954     /usr/src/lib/libevent/event.c: No such file or directory.
where
#0  0x0000197245b834f5 in event_queue_insert (base=0x19719ae76400, ev=<optimized out>, queue=8) at /usr/src/lib/libevent/event.c:954
#1  event_active (ev=<optimized out>, res=1, ncalls=1) at /usr/src/lib/libevent/event.c:806
#2  timeout_process (base=<optimized out>) at /usr/src/lib/libevent/event.c:900
#3  event_base_loop (base=0x19719ae76400, flags=0) at /usr/src/lib/libevent/event.c:499
#4  0x0000196f8940eeaf in ?? ()
#5  0x0000197277b6cdce in _rthread_start (v=0x19719ae76400) at /usr/src/lib/librthread/rthread.c:96
#6  0x000019726e03db0b in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
#7  0x0000000000000000 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
* 1    thread 152280     0x0000197245b834f5 in event_queue_insert (base=0x19719ae76400, ev=<optimized out>, queue=8) at /usr/src/lib/libevent/event.c:954
  2    thread 376076     futex () at -:3
  3    thread 433606     futex () at -:3
(gdb) thread 2
[Switching to thread 2 (thread 376076)]
#0  futex () at -:3
3       -: No such file or directory.
(gdb) bt
#0  futex () at -:3
#1  0x000019726e08fed5 in _rthread_cond_timedwait (cond=0x19719ae752c0, mutexp=0x196f896b3ed0, abs=0x0) at /usr/src/lib/libc/thread/synch.h:41
#2  0x0000196f8940d89c in ?? ()
#3  0x0000196f8940caff in ?? ()
#4  0x0000196f8940c09e in ?? ()
#5  0x0000196f8940b7eb in ?? ()
#6  0x0000196f89408b2f in ?? ()
#7  0x0000197245b8364d in event_process_active (base=<optimized out>) at /usr/src/lib/libevent/event.c:350
#8  event_base_loop (base=0x19719ae72000, flags=0) at /usr/src/lib/libevent/event.c:502
#9  0x0000196f89409538 in ?? ()
#10 0x0000196f8940850b in ?? ()
#11 0x0000196f89403a1d in ?? ()
#12 0x0000196f89400d86 in ?? ()
#13 0x0000000000000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (thread 433606)]
#0  futex () at -:3
3       in -
(gdb) bt
#0  futex () at -:3
#1  0x000019726e08fed5 in _rthread_cond_timedwait (cond=0x19719ae74520, mutexp=0x196f896b3ee0, abs=0x0) at /usr/src/lib/libc/thread/synch.h:41
#2  0x0000196f8940ec18 in ?? ()
#3  0x0000197277b6cdce in _rthread_start (v=0x53) at /usr/src/lib/librthread/rthread.c:96
#4  0x000019726e03db0b in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
#5  0x0000000000000000 in ?? ()
(gdb)

Here's the tail of /var/log/messages:

Oct  8 04:30:00 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-1
Oct  8 04:30:00 ci-openbsd /bsd: vmm_handle_cpuid: unsupported rax=0x40000100
Oct  8 04:30:01 ci-openbsd /bsd: vmm_handle_cpuid: function 0x06 (thermal/power mgt) not supported
Oct  8 04:30:01 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 04:30:01 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 04:30:01 ci-openbsd /bsd: vmm_handle_cpuid: invalid cpuid input leaf 0x15, guest rip=0xffffffff817c0285 - resetting to 0xd
Oct  8 04:30:01 ci-openbsd /bsd: vmx_handle_wrmsr: wrmsr exit, msr=0x277, discarding data written from guest=0x70106:0x70106
Oct  8 04:30:10 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 04:31:16 ci-openbsd vmd[31058]: ci-openbsd-main-test-0: vcpu_assert_pic_irq: can't assert INTR
Oct  8 09:00:01 ci-openbsd syslogd[14719]: restart
Oct  8 10:30:13 ci-openbsd /bsd: vm_impl_init_vmx: created vm_map @ 0xffff800000b58100
Oct  8 10:30:13 ci-openbsd /bsd: vm_resetcpu: resetting vm 436 vcpu 0 to power on defaults
Oct  8 10:30:13 ci-openbsd /bsd: Guest EPTP = 0x3c06cf01e
Oct  8 10:30:13 ci-openbsd /bsd: vmm_handle_cpuid: function 0x07 (SEFF) unsupported subleaf 0x6c65746e not supported
Oct  8 10:30:13 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 10:30:13 ci-openbsd /bsd: vmx_handle_cr: mov to cr0 @ 100060a, data=0xe0010031
Oct  8 10:30:14 ci-openbsd /bsd: vmm_handle_cpuid: unsupported rax=0x40000100
Oct  8 10:30:14 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 10:30:14 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 10:30:14 ci-openbsd /bsd: vmm_handle_cpuid: function 0x06 (thermal/power mgt) not supported
Oct  8 10:30:14 ci-openbsd /bsd: vmm_handle_cpuid: invalid cpuid input leaf 0x15, guest rip=0xffffffff81183975 - resetting to 0xd
Oct  8 10:30:14 ci-openbsd /bsd: vmx_handle_wrmsr: wrmsr exit, msr=0x277, discarding data written from guest=0x70106:0x70106
Oct  8 10:30:14 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: user 1000 cpu limit reached
Oct  8 10:30:14 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-0
Oct  8 10:30:23 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 10:31:39 ci-openbsd vmd[27761]: ci-openbsd-main-test-1: vcpu_assert_pic_irq: can't assert INTR
Oct  8 14:30:17 ci-openbsd /bsd: vm_impl_init_vmx: created vm_map @ 0xffff800000b76200
Oct  8 14:30:18 ci-openbsd /bsd: vm_resetcpu: resetting vm 437 vcpu 0 to power on defaults
Oct  8 14:30:18 ci-openbsd /bsd: Guest EPTP = 0x43e9be01e
Oct  8 14:30:18 ci-openbsd /bsd: vmm_handle_cpuid: function 0x07 (SEFF) unsupported subleaf 0x6c65746e not supported
Oct  8 14:30:18 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 14:30:18 ci-openbsd /bsd: vmx_handle_cr: mov to cr0 @ 100060a, data=0xe0010031
Oct  8 14:30:19 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: user 1000 cpu limit reached
Oct  8 14:30:19 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-0
Oct  8 14:30:19 ci-openbsd /bsd: vmm_handle_cpuid: unsupported rax=0x40000100
Oct  8 14:30:20 ci-openbsd /bsd: vmm_handle_cpuid: function 0x06 (thermal/power mgt) not supported
Oct  8 14:30:20 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 14:30:20 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 14:30:20 ci-openbsd /bsd: vmm_handle_cpuid: invalid cpuid input leaf 0x15, guest rip=0xffffffff819bf425 - resetting to 0xd
Oct  8 14:30:20 ci-openbsd /bsd: vmx_handle_wrmsr: wrmsr exit, msr=0x277, discarding data written from guest=0x70106:0x70106
Oct  8 14:30:35 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 14:31:33 ci-openbsd vmd[61255]: ci-openbsd-main-test-1: vcpu_assert_pic_irq: can't assert INTR
Oct  8 22:30:13 ci-openbsd /bsd: vm_impl_init_vmx: created vm_map @ 0xffff800000b6e500
Oct  8 22:30:13 ci-openbsd /bsd: vm_resetcpu: resetting vm 438 vcpu 0 to power on defaults
Oct  8 22:30:13 ci-openbsd /bsd: Guest EPTP = 0x3bef6b01e
Oct  8 22:30:13 ci-openbsd /bsd: vmm_handle_cpuid: function 0x07 (SEFF) unsupported subleaf 0x6c65746e not supported
Oct  8 22:30:13 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 22:30:13 ci-openbsd /bsd: vmx_handle_cr: mov to cr0 @ 100060a, data=0xe0010031
Oct  8 22:30:14 ci-openbsd /bsd: vmm_handle_cpuid: unsupported rax=0x40000100
Oct  8 22:30:14 ci-openbsd vmd[38905]: ci-openbsd-main-test-1: user 1000 cpu limit reached
Oct  8 22:30:14 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-1
Oct  8 22:30:14 ci-openbsd /bsd: vmm_handle_cpuid: function 0x06 (thermal/power mgt) not supported
Oct  8 22:30:14 ci-openbsd /bsd: vmm_handle_cpuid: invalid cpuid input leaf 0x15, guest rip=0xffffffff8190b9f5 - resetting to 0xd
Oct  8 22:30:14 ci-openbsd /bsd: vmx_handle_wrmsr: wrmsr exit, msr=0x277, discarding data written from guest=0x70106:0x70106
Oct  8 22:30:15 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 22:30:15 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 22:30:23 ci-openbsd /bsd: vmm_handle_cpuid: function 0x0a (arch. perf mon) not supported
Oct  8 22:31:37 ci-openbsd vmd[45252]: ci-openbsd-main-test-0: vcpu_assert_pic_irq: can't assert INTR

And the same period of /var/log/daemon:

Oct  8 00:34:47 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: started vm 487 successfully, tty /dev/ttyp3
Oct  8 00:34:48 ci-openbsd vmd[38905]: ci-openbsd-main-test-1: user 1000 cpu limit reached
Oct  8 00:34:48 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-1
Oct  8 00:34:49 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 00:34:49 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 00:36:08 ci-openbsd vmd[97254]: ci-openbsd-main-test-0: can't set INTR: No such file or directory
Oct  8 04:29:59 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: started vm 490 successfully, tty /dev/ttyp3
Oct  8 04:30:00 ci-openbsd vmd[38905]: ci-openbsd-main-test-1: user 1000 cpu limit reached
Oct  8 04:30:00 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-1
Oct  8 04:30:01 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 04:30:01 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 04:31:16 ci-openbsd vmd[31058]: ci-openbsd-main-test-0: vcpu_assert_pic_irq: can't assert INTR
Oct  8 10:30:13 ci-openbsd vmd[38905]: ci-openbsd-main-test-1: started vm 493 successfully, tty /dev/ttyp3
Oct  8 10:30:14 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 10:30:14 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 10:30:14 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: user 1000 cpu limit reached
Oct  8 10:30:14 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-0
Oct  8 10:31:39 ci-openbsd vmd[27761]: ci-openbsd-main-test-1: vcpu_assert_pic_irq: can't assert INTR
Oct  8 14:30:18 ci-openbsd vmd[38905]: ci-openbsd-main-test-1: started vm 496 successfully, tty /dev/ttyp3
Oct  8 14:30:19 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: user 1000 cpu limit reached
Oct  8 14:30:19 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-0
Oct  8 14:30:20 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 14:30:20 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 14:31:33 ci-openbsd vmd[61255]: ci-openbsd-main-test-1: vcpu_assert_pic_irq: can't assert INTR
Oct  8 22:30:13 ci-openbsd vmd[38905]: ci-openbsd-main-test-0: started vm 499 successfully, tty /dev/ttyp3
Oct  8 22:30:14 ci-openbsd vmd[38905]: ci-openbsd-main-test-1: user 1000 cpu limit reached
Oct  8 22:30:14 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-1
Oct  8 22:30:15 ci-openbsd vmd[38905]: ci-openbsd-main-test-2: user 1000 cpu limit reached
Oct  8 22:30:15 ci-openbsd vmd[38905]: config_setvm: failed to start vm ci-openbsd-main-test-2
Oct  8 22:31:37 ci-openbsd vmd[45252]: ci-openbsd-main-test-0: vcpu_assert_pic_irq: can't assert INTR
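
To see how often each VM hits the per-user cpu limit, the vmd lines can be
tallied per VM name. A small sketch, assuming the syslog layout shown above
(VM name in the sixth whitespace-separated field); count_cpu_limit is a
hypothetical name, and it keys on the "user <uid>" part so it also works on
wrapped copies of the log:

```shell
#!/bin/sh
# Read syslog lines on stdin and print "<count> <vm name>" for every VM
# that logged a "user <uid> cpu limit reached" message.
count_cpu_limit() {
    awk '/vmd\[[0-9]+\]: [^ ]+: user [0-9]+/ {
        name = $6
        sub(/:$/, "", name)   # drop the trailing colon after the VM name
        n[name]++
    }
    END { for (v in n) print n[v], v }' | sort -k2
}

# Example:
#   count_cpu_limit < /var/log/daemon
```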


-- 
nest.cx is Gmail hosted, use PGP for anything private. Key:
http://goo.gl/6dMsr
Fingerprint: 5E2B 2D0E 1E03 2046 BEC3  4D50 0B15 42BD 8DF5 A1B0
