Re: vmd losing VMs

2018-10-23 Thread Greg Steuck
Not that I expected anything to change considering the patches submitted,
but Oct 21 snapshot (minus "Add support to create and convert disk images
from existing images") is similarly afflicted. Any candidate fixes or
patches for added logging will be put to test in short order :)

ci-openbsd$ dmesg | head
OpenBSD 6.4-current (GENERIC.MP) #376: Sun Oct 21 22:46:20 MDT 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17163079680 (16367MB)
avail mem = 16633651200 (15863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine
ci-openbsd$ ps ax | grep vm
 75798 ??  Is  0:00.04 vmd: priv (vmd)
70138 ??  Isp 0:02.35 /usr/sbin/vmd
49408 ??  Isp 0:01.25 vmd: vmm (vmd)
87301 ??  Isp 0:04.63 vmd: control (vmd)
83798 ??  Rp/2  874:45.69 vmd: ci-openbsd-main-0 (vmd)
74819 ??  Rp/0  469:43.70 vmd: ci-openbsd-main-1 (vmd)
52785 ??  Rp/3    8:04.52 vmd: ci-openbsd-main-2 (vmd)
58963 ??  Ip  0:00.02 vmctl stop ci-openbsd-main-0 -f -w
25424 ??  Ip  0:00.02 vmctl stop ci-openbsd-main-1 -f -w
58043 p3  S+p 0:00.01 grep vm


Re: vmd losing VMs

2018-10-17 Thread Dmitry Vyukov
On Wed, Oct 17, 2018 at 9:40 PM, Greg Steuck  wrote:
> On Wed, Oct 17, 2018 at 12:00 PM Dmitry Vyukov  wrote:
>>
>> On Wed, Oct 17, 2018 at 4:50 AM, Greg Steuck  wrote:
>> > I think I see some evidence of occasional VM just never finishing
>> > booting to the point of running sshd. At least that's what I surmised
>> > from the posted manager.log file.
>> > https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632
>>
>>
>> This should be fine in the sense that it should not lead to vmd losing
>> VMs.
>
>
> I agree that vmd shouldn't be losing the VMs in such a case. But some
> evidence shows they become unkillable as described in my previous message:
> https://marc.info/?l=openbsd-tech=153955188302856=2


But VMM has no idea about the guest OS, OS booting, etc. For VMM
it's just a CPU that runs some instruction stream. There may be no OS
at all. So I don't see how this can affect vmd behavior. If killing a
VM leads to hangs, then most likely that can equally happen whether the
kill occurs during boot or not.


>> If test machine does not come up live within some time frame, we
>> kill it and create a new one.
>
>
> How tight a timeout does it have?


10 mins to print IP, and then +20 mins for sshd to come up:
https://github.com/google/syzkaller/blob/master/vm/vmm/vmm.go#L215
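The check linked above lives in syzkaller's Go code; a rough shell sketch of the same two-stage deadline looks like this. The helper names `guest_printed_ip` and `ssh_responds` are hypothetical stand-ins for the real checks, and the per-stage timeouts are overridable only for illustration:

```shell
#!/bin/sh
# Two-stage boot deadline as described above: up to 10 minutes for the
# guest to print its IP, then up to 20 more minutes for sshd to answer.
# guest_printed_ip/ssh_responds are hypothetical placeholder checks.

wait_for() {
    # wait_for <seconds> <check-command ...>
    # Poll the check once per second until it succeeds or time runs out.
    _deadline=$1; shift
    _t=0
    while [ "$_t" -lt "$_deadline" ]; do
        "$@" && return 0
        sleep 1
        _t=$((_t + 1))
    done
    return 1
}

boot_deadline() {
    # $1 is the VM name; give up (so the caller can kill and recreate
    # the VM) if either stage times out. Timeouts are parameterized so
    # the logic can be exercised with short values.
    wait_for "${IP_TIMEOUT:-600}" guest_printed_ip "$1" ||
        { echo "no IP"; return 1; }
    wait_for "${SSH_TIMEOUT:-1200}" ssh_responds "$1" ||
        { echo "no sshd"; return 1; }
    echo "up"
}
```

If either stage trips, the VM is torn down and a fresh one is created, which is why a guest that never reaches sshd should not, by itself, wedge vmd.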



Re: vmd losing VMs

2018-10-17 Thread Greg Steuck
> Can you net out the current issues you're facing? I can't seem to grok
> the log file posted above (not sure what I'm looking at). From reviewing
> the thread, it seems the core dumps are gone but there may (?) be some
> issues still?

The core dumps are gone indeed. Thanks for the prompt fixes!

I'm using the Oct 11 snapshot.

The issue that's still vexing is VMs sticking in limbo. There's a vmd
process spinning at 100% cpu. vmctl status doesn't show the VM. vmctl stop
hangs waiting for a response from vmd. I don't think there's enough logging
in vmd. At least I don't see anything revealing in /var/log/daemon despite
running with -vv. Pasted the data overlapping with
https://marc.info/?l=openbsd-tech=153955188302856=2 as
https://gist.github.com/blackgnezdo/ebd728246abe418b1867df1361c95e27.

ci-openbsd$ ps ax | grep vm
34144 ??  Is  0:00.39 vmd: priv (vmd)
63860 ??  Isp 0:04.99 /usr/sbin/vmd -vv
 4223 ??  Isp 0:06.20 vmd: vmm (vmd)
71772 ??  Isp 0:09.01 vmd: control (vmd)
72373 ??  Rp/3  1136:32.53 vmd: ci-openbsd-main-1 (vmd)
90783 ??  Rp/3   50:24.59 vmd: ci-openbsd-main-2 (vmd)
47967 ??  Rp/0   49:57.49 vmd: ci-openbsd-main-0 (vmd)
55129 ??  Ip  0:00.02 vmctl stop ci-openbsd-main-1 -f -w

I could work around this in classic sysadmin fashion by writing a script
which greps for any long-running vmctl stop processes, then kills the
corresponding vmd. I think this works, but I also suspect that the
internal vmd accounting which imposes the 4-VM limit will soon get in
the way.
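A sketch of that workaround, assuming BSD `ps` output shaped like the listings in this thread (strictly illustrative, not production-ready):

```shell
#!/bin/sh
# Workaround sketch: parse `ps` output, find pending `vmctl stop <name>`
# invocations, and print the pid of the matching "vmd: <name> (vmd)"
# process so the caller can kill it.

stuck_vmd_pids() {
    # Reads ps output on stdin; prints "<pid> <name>" per stuck VM.
    awk '
        /vmctl stop/ {
            # e.g.: 55129 ??  Ip  0:00.02 vmctl stop ci-openbsd-main-1 -f -w
            for (i = 1; i <= NF; i++)
                if ($i == "stop") stuck[$(i + 1)] = 1
        }
        /vmd: / {
            # e.g.: 72373 ??  Rp/3  1136:32.53 vmd: ci-openbsd-main-1 (vmd)
            for (i = 1; i <= NF; i++)
                if ($i == "vmd:") pid[$(i + 1)] = $1
        }
        END {
            for (n in stuck)
                if (n in pid) print pid[n], n
        }
    '
}

# Usage (the kill is left to the operator; wiring in a "long running"
# age check via ps -o etime is left out for brevity):
#   ps ax | stuck_vmd_pids | while read pid name; do kill -9 "$pid"; done
```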

BTW, we could use a higher VM count limit. I don't want it badly enough to
implement config passing code or run the VMs as root, hi Reyk :)

The log was from syz-manager, so it was mostly for Dmitry's benefit.

Thanks
Greg


Re: vmd losing VMs

2018-10-17 Thread Greg Steuck
On Wed, Oct 17, 2018 at 12:00 PM Dmitry Vyukov  wrote:

> On Wed, Oct 17, 2018 at 4:50 AM, Greg Steuck  wrote:
> > I think I see some evidence of occasional VM just never finishing
> > booting to the point of running sshd. At least that's what I surmised
> > from the posted manager.log file.
> > https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632
>
>
> This should be fine in the sense that it should not lead to vmd losing VMs.


I agree that vmd shouldn't be losing the VMs in such a case. But some
evidence shows they become unkillable as described in my previous message:
https://marc.info/?l=openbsd-tech=153955188302856=2


> If test machine does not come up live within some time frame, we
> kill it and create a new one.


How tight a timeout does it have?

Thanks
Greg


Re: vmd losing VMs

2018-10-17 Thread Dmitry Vyukov
On Wed, Oct 17, 2018 at 4:50 AM, Greg Steuck  wrote:
> I think I see some evidence of occasional VM just never finishing booting to
> the point of running sshd. At least that's what I surmised from the posted
> manager.log file.
> https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632


This should be fine in the sense that it should not lead to vmd losing
VMs. If a test machine does not come up within some time frame, we kill
it and create a new one.


> Here's an excerpt:
> 2018/10/16 19:35:08 VMs 2, executed 68322, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:13 failed to create instance: can't ssh into the instance:
> failed to run ["ssh" "-p" "22" "-i" "/syzkaller/managers/main/current/key"
> "-F" "/dev/null" "-o" "UserKnownHostsFile=/dev/null" "-o" "BatchMode=yes"
> "-o" "IdentitiesOnly=yes" "-o" "StrictHostKeyChecking=no" "-o"
> "ConnectTimeout=10" "-i" "/syzkaller/managers/main/current/key"
> "root@100.66.10.3" "pwd"]: exit status 255
> ssh: connect to host 100.66.10.3 port 22: Operation timed out
>
> [ using 1975784 bytes of bsd ELF symbol table ]
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2018 OpenBSD. All rights reserved.
> https://www.OpenBSD.org
> OpenBSD 6.4-current (SYZKALLER) #73: Tue Oct 16 10:17:58 PDT 2018
>
> root@ci-openbsd.syzkaller:/syzkaller/managers/main/kernel/sys/arch/amd64/compile/SYZKALLER
> real mem = 520093696 (496MB)
> avail mem = 493764608 (470MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0
> acpi at bios0 not configured
> cpu0 at mainbus0: (uniprocessor)
> cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2335.02 MHz, 06-3f-00
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> pvbus0 at mainbus0: OpenBSD
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
> viornd0 at virtio0
> virtio0: irq 3
> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
> vioblk0 at virtio1
> scsibus1 at vioblk0: 2 targets
> sd0 at scsibus1 targ 0 lun 0:  SCSI3 0/direct fixed
> sd0: 1024MB, 512 bytes/sector, 2097152 sectors
> virtio1: irq 5
> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
> vio0 at virtio2: address fe:e1:bb:d1:d0:3f
> virtio2: irq 6
> virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
> vmmci0 at virtio3
> virtio3: irq 7
> isa0 at mainbus0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
> com0: console
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> root on sd0a (e12e74b46a21f42a.a) swap on sd0b dump on sd0b
> Automatic boot in progress: starting file system checks.
> /dev/sd0a (e12e74b46a21f42a.a): file system is clean; not checking
> setting tty flags
> starting network
> vio0: bound to 100.66.10.3 from 100.66.10.2 (fe:e1:bb:d1:d0:40)
> starting early daemons: syslogd.
> starting RPC daemons:.
> savecore: no core dump
> acpidump: Can't find ACPI information
> checking quotas: done.
> clearing /tmp
> kern.securelevel: 0 -> 1
> creating runtime link editor directory cache.
> preserving editor files.
> 2018/10/16 19:35:18 VMs 2, executed 68355, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:18 hub sync: send: add 1, del 0, repros 0; recv: progs 0,
> repros 0; more 0
> 2018/10/16 19:35:28 VMs 2, executed 68406, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:38 VMs 2, executed 68451, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:48 VMs 2, executed 68479, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:58 VMs 2, executed 68501, cover 18582, crashes 78, repro 0
> 2018/10/16 19:36:08 VMs 2, executed 68569, cover 18582, crashes 78, repro 0
> 2018/10/16 19:36:16 failed to create instance: vmm exited
> vmctl: start vm command failed: Too many processes
>
>
>
> On Wed, Oct 10, 2018 at 3:06 AM Dmitry Vyukov  wrote:
>>
>> On Tue, Oct 2, 2018 at 8:02 PM, Greg Steuck  wrote:
>> > Dmitry, is there an easy way get at the VMs output?
>>
>>
>> You mean the test VM instances (vmd), right?

Re: vmd losing VMs

2018-10-16 Thread Greg Steuck
I think I see some evidence of an occasional VM just never finishing
booting to the point of running sshd. At least that's what I surmised
from the posted manager.log file.
https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632

Here's an excerpt:
2018/10/16 19:35:08 VMs 2, executed 68322, cover 18582, crashes 78, repro 0
2018/10/16 19:35:13 failed to create instance: can't ssh into the instance:
failed to run ["ssh" "-p" "22" "-i" "/syzkaller/managers/main/current/key"
"-F" "/dev/null" "-o" "UserKnownHostsFile=/dev/null" "-o" "BatchMode=yes"
"-o" "IdentitiesOnly=yes" "-o" "StrictHostKeyChecking=no" "-o"
"ConnectTimeout=10" "-i" "/syzkaller/managers/main/current/key"
"root@100.66.10.3" "pwd"]: exit status 255
ssh: connect to host 100.66.10.3 port 22: Operation timed out

[ using 1975784 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2018 OpenBSD. All rights reserved.
https://www.OpenBSD.org
OpenBSD 6.4-current (SYZKALLER) #73: Tue Oct 16 10:17:58 PDT 2018
root@ci-openbsd.syzkaller:/syzkaller/managers/main/kernel/sys/arch/amd64/compile/SYZKALLER
real mem = 520093696 (496MB)
avail mem = 493764608 (470MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0
acpi at bios0 not configured
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2335.02 MHz, 06-3f-00
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
pvbus0 at mainbus0: OpenBSD
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
viornd0 at virtio0
virtio0: irq 3
virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio1
scsibus1 at vioblk0: 2 targets
sd0 at scsibus1 targ 0 lun 0:  SCSI3 0/direct fixed
sd0: 1024MB, 512 bytes/sector, 2097152 sectors
virtio1: irq 5
virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio2: address fe:e1:bb:d1:d0:3f
virtio2: irq 6
virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
vmmci0 at virtio3
virtio3: irq 7
isa0 at mainbus0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
com0: console
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (e12e74b46a21f42a.a) swap on sd0b dump on sd0b
Automatic boot in progress: starting file system checks.
/dev/sd0a (e12e74b46a21f42a.a): file system is clean; not checking
setting tty flags
starting network
vio0: bound to 100.66.10.3 from 100.66.10.2 (fe:e1:bb:d1:d0:40)
starting early daemons: syslogd.
starting RPC daemons:.
savecore: no core dump
acpidump: Can't find ACPI information
checking quotas: done.
clearing /tmp
kern.securelevel: 0 -> 1
creating runtime link editor directory cache.
preserving editor files.
2018/10/16 19:35:18 VMs 2, executed 68355, cover 18582, crashes 78, repro 0
2018/10/16 19:35:18 hub sync: send: add 1, del 0, repros 0; recv: progs 0,
repros 0; more 0
2018/10/16 19:35:28 VMs 2, executed 68406, cover 18582, crashes 78, repro 0
2018/10/16 19:35:38 VMs 2, executed 68451, cover 18582, crashes 78, repro 0
2018/10/16 19:35:48 VMs 2, executed 68479, cover 18582, crashes 78, repro 0
2018/10/16 19:35:58 VMs 2, executed 68501, cover 18582, crashes 78, repro 0
2018/10/16 19:36:08 VMs 2, executed 68569, cover 18582, crashes 78, repro 0
2018/10/16 19:36:16 failed to create instance: vmm exited
vmctl: start vm command failed: Too many processes



On Wed, Oct 10, 2018 at 3:06 AM Dmitry Vyukov  wrote:

> On Tue, Oct 2, 2018 at 8:02 PM, Greg Steuck  wrote:
> > Dmitry, is there an easy way get at the VMs output?
>
>
> You mean the test VM instances (vmd), right?
>
> We capture vmm/kernel output for crashes, you can see it in the Log
> columns for each crash here:
> https://syzkaller.appspot.com/#openbsd
>
> Also if syz-manager is started with -debug flag, then it dumps
> everything into terminal in real time, including vmm/kernel output.
> This is intended for debugging of reliable mis-behaviors (the thing is
> not working at all).
>
>
>
>
> > Here's dmesg from VMM_DEBUG-enabled kernel (built at "Add support for
> > RT3290 chipset by James Hastings."). No lockup thus far.
> >
> > OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct  2 10:37:13 PDT 2018
> >
> > syzkaller@ci-openbsd.syzkaller:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
> > real mem = 17163079680 (16367MB)
> > avail mem = 16633610240 (15863MB)
> > mpath0 at root
> > scsibus0 at mpath0: 256 targets
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
> > bios0: vendor Google version "Google" date 01/01/2011
> > 

Re: vmd losing VMs

2018-10-14 Thread Greg Steuck
Now that I'm running OpenBSD 6.4 (GENERIC.MP) #362: Thu Oct 11 04:53:41 MDT
2018, I can start debugging again. I just observed an interesting tidbit
which I failed to notice before: there are also hanging vmctl processes
trying to stop those spinning VMs. So I tried to reproduce this myself.
The first attempt shows that vmd is somewhat aware of the VM's presence
(even though it doesn't report it in vmctl status).

ci-openbsd$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY     OWNER NAME
    1     -     1    512M       -       - syzkaller syzkaller

ci-openbsd$ ./obj/vmctl stop ci-openbsd-main-2x -f -w
stopping vm ci-openbsd-main-2x: vm not found

^^^ Here a random VM name is refused. OTOH, trying to stop a previously
known (and currently spinning) VM hangs in imsg_read.

ci-openbsd$ gdb -q -- /syzkaller/src/usr.sbin/vmctl/obj/vmctl
(gdb) run stop ci-openbsd-main-2 -f -w
Starting program: /syzkaller/src/usr.sbin/vmctl/obj/vmctl stop
ci-openbsd-main-2 -f -w
stopping vm ci-openbsd-main-2: ^C
Current language:  auto; currently asm
(gdb) where
#0  _thread_sys_recvmsg () at -:3
#1  0x1c25828acd6e in _libc_recvmsg_cancel (fd=Variable "fd" is not
available.
) at /usr/src/lib/libc/sys/w_recvmsg.c:27
#2  0x1c2520999521 in imsg_read (ibuf=0x1c24fc4ed000) at
/usr/src/lib/libutil/imsg.c:82
#3  0x1c22f490392c in vmmaction (res=Variable "res" is not available.
) at /syzkaller/src/usr.sbin/vmctl/main.c:273
#4  0x1c22f4902fd2 in ctl_stop (res=0x7f7e6f80, argc=Variable
"argc" is not available.

) at /syzkaller/src/usr.sbin/vmctl/main.c:793
#5  0x1c22f490351e in parse (argc=4, argv=Variable "argv" is not
available.
) at /syzkaller/src/usr.sbin/vmctl/main.c:172
#6  0x1c22f49033be in main (argc=4, argv=Variable "argv" is not
available.
) at /syzkaller/src/usr.sbin/vmctl/main.c:134
(gdb)

ci-openbsd$ uname -a
OpenBSD ci-openbsd.syzkaller 6.4 GENERIC.MP#362 amd64
ci-openbsd$ dmesg | head
OpenBSD 6.4 (GENERIC.MP) #362: Thu Oct 11 04:53:41 MDT 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17163079680 (16367MB)
avail mem = 16633643008 (15863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine
ci-openbsd$ ps ax | grep vm
55596 ??  Ssp 0:04.86 vmd: vmm (vmd)
22978 ??  Is  0:00.22 vmd: priv (vmd)
52555 ??  Ssp 0:13.01 vmd: control (vmd)
17471 ??  Ssp 0:06.15 /usr/sbin/vmd
29044 ??  Rp/0  2197:50.09 vmd: ci-openbsd-main-2 (vmd)
52266 ??  Rp/1  257:11.58 vmd: ci-openbsd-main-1 (vmd)
15989 ??  Rp/1  241:18.45 vmd: ci-openbsd-main-0 (vmd)
19071 ??  Ip  0:00.02 vmctl stop ci-openbsd-main-1 -f -w
88222 ??  Ip  0:00.02 vmctl stop ci-openbsd-main-0 -f -w
 6142 ??  Sp  0:00.01 vmctl stop ci-openbsd-main-2 -f -w
ci-openbsd$ dmesg |tail
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation)
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (11584d676adca97e.a) swap on sd0b dump on sd0b
ci-openbsd$

Oct 13 19:38:41 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm
330 successfully, tty /dev/ttyp0
Oct 13 20:16:56 ci-openbsd vmd[29583]: ci-openbsd-main-1:
vcpu_assert_pic_irq: can't assert INTR
Oct 13 20:17:00 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm
331 successfully, tty /dev/ttyp0
Oct 13 20:22:04 ci-openbsd vmd[31844]: vcpu_run_loop: vm 320 / vcpu 0
run ioctl failed: No such file or directory
Oct 13 20:22:07 ci-openbsd vmd[17471]: ci-openbsd-main-0: started vm
332 successfully, tty /dev/ttyp2
Oct 13 21:12:18 ci-openbsd vmd[57830]: ci-openbsd-main-1: can't clear
INTR: No such file or directory
Oct 13 21:12:22 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm
333 successfully, tty /dev/ttyp0
Oct 13 21:23:39 ci-openbsd vmd[81046]: ci-openbsd-main-0: can't clear
INTR: No such file or directory
Oct 13 21:23:42 ci-openbsd vmd[17471]: ci-openbsd-main-0: started vm
334 successfully, tty /dev/ttyp2
Oct 13 21:43:42 ci-openbsd vmd[59472]: ci-openbsd-main-0: can't clear
INTR: No such file or directory
Oct 13 21:43:47 ci-openbsd vmd[17471]: ci-openbsd-main-0: started vm
335 successfully, tty /dev/ttyp2
Oct 13 21:52:36 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm
336 successfully, tty /dev/ttyp0
Oct 13 22:06:47 ci-openbsd vmd[58824]: ci-openbsd-main-1:
vcpu_assert_pic_irq: can't assert INTR
Oct 13 22:06:51 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm
337 successfully, tty /dev/ttyp0
Oct 13 22:13:31 ci-openbsd vmd[6946]: ci-openbsd-main-1:
vcpu_assert_pic_irq: can't assert INTR
Oct 13 22:13:35 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm
338 successfully, tty /dev/ttyp0
Oct 13 22:45:14 ci-openbsd vmd[45351]: ci-openbsd-main-0:
vcpu_assert_pic_irq: 

Re: vmd losing VMs

2018-10-10 Thread Dmitry Vyukov
On Tue, Oct 2, 2018 at 8:02 PM, Greg Steuck  wrote:
> Dmitry, is there an easy way get at the VMs output?


You mean the test VM instances (vmd), right?

We capture vmm/kernel output for crashes, you can see it in the Log
columns for each crash here:
https://syzkaller.appspot.com/#openbsd

Also if syz-manager is started with -debug flag, then it dumps
everything into terminal in real time, including vmm/kernel output.
This is intended for debugging of reliable mis-behaviors (the thing is
not working at all).




> Here's dmesg from VMM_DEBUG-enabled kernel (built at "Add support for RT3290
> chipset by James Hastings."). No lockup thus far.
>
> OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct  2 10:37:13 PDT 2018
>
> syzkaller@ci-openbsd.syzkaller:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
> real mem = 17163079680 (16367MB)
> avail mem = 16633610240 (15863MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
> bios0: vendor Google version "Google" date 01/01/2011
> bios0: Google Google Compute Engine
> acpi0 at bios0: rev 0
> acpi0: sleep states S3 S4 S5
> acpi0: tables DSDT FACP SSDT APIC WAET SRAT
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2300.55 MHz, 06-3f-00
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 990MHz
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu1:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.87 MHz, 06-3f-00
> cpu2:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.89 MHz, 06-3f-00
> cpu3:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 0, core 3, package 0
> cpu4 at mainbus0: apid 1 (application processor)
> cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu4:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu4: 256KB 64b/line 8-way L2 cache
> cpu4: smt 1, core 0, package 0
> cpu5 at mainbus0: apid 3 (application processor)
> cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu5:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu5: 256KB 64b/line 8-way L2 cache
> cpu5: smt 1, core 1, package 0
> cpu6 at mainbus0: apid 5 (application processor)
> cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.94 MHz, 06-3f-00
> cpu6:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu6: 256KB 64b/line 8-way L2 cache
> cpu6: smt 1, core 2, package 0
> cpu7 at mainbus0: apid 7 (application processor)
> cpu7: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.91 MHz, 06-3f-00
> cpu7:
> 

Re: vmd losing VMs

2018-10-09 Thread Greg Steuck
All my vmds are stuck after syzkaller was running for some 5 days (and
found a couple of bugs!). I'll reinstall the system to get a fresh baseline.

> 1. Are you getting any vmd cores?
>  * sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd
>  * kill all vmd and restart the test. Not sure if that sysctl can be done
>after boot or if it needs to be in /etc/sysctl.conf

After applying your prompt fixes for the two bugs, I'm no longer getting
any core dumps.

> 2. building a host kernel with VMM_DEBUG will be helpful here to see
>    why VMs are disappearing.

Done. Still running the kernel from a few days back.

> 3. You're running this nested in GCP? Do you know what VMX features they
>    expose to guests? Perhaps there is an assumption being made in vmm that
>    we have a certain feature not being exposed by the underlying GCP
>    hypervisor (although I'm pretty sure that's not the case, might be good
>    to check - VMM_DEBUG will tell us this).


ci-openbsd$ ps ax | grep vmd
39771 ??  Ssp 0:06.67 vmd: vmm (vmd)
 7674 ??  Is  0:00.25 vmd: priv (vmd)
44564 ??  Ssp 0:22.78 vmd: control (vmd)
38905 ??  Ssp 0:13.43 /usr/sbin/vmd
88755 ??  Rp/3  4610:25.87 vmd: ci-openbsd-main-0 (vmd)
 9636 ??  Rp/1  4360:16.00 vmd: ci-openbsd-main-2 (vmd)
59918 ??  Rp/1  3559:18.43 vmd: ci-openbsd-main-1 (vmd)

The VMs are unpingable:

ci-openbsd$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            10.128.0.1         UGS        8    12002     -     8 vio0
224/4              127.0.0.1          URS        0        0 32768     8 lo0
10.128.0.1         42:01:0a:80:00:01  UHLch      1     8300     -     7 vio0
10.128.0.1/32      10.128.0.63        UCS        1        0     -     8 vio0
10.128.0.63        42:01:0a:80:00:3f  UHLl       0     9050     -     1 vio0
10.128.0.63/32     10.128.0.63        UCn        0        0     -     4 vio0
100.65.69.2/31     100.65.69.2        UCn        0        3     -     4 tap2
100.65.69.2        fe:e1:ba:d7:6a:76  UHLl       0        0     -     1 tap2
100.65.104.2/31    100.65.104.2       UCn        0        3     -     4 tap0
100.65.104.2       fe:e1:ba:d8:6a:68  UHLl       0        1     -     1 tap0
100.65.212.2/31    100.65.212.2       UCn        0        3     -     4 tap1
100.65.212.2       fe:e1:ba:da:00:a0  UHLl       0        1     -     1 tap1
127/8              127.0.0.1          UGRS       0        0 32768     8 lo0
127.0.0.1          127.0.0.1          UHhl       1       47 32768     1 lo0

ci-openbsd$ ping 100.65.69.3
PING 100.65.69.3 (100.65.69.3): 56 data bytes
^C
--- 100.65.69.3 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$ ping 100.65.104.3
PING 100.65.104.3 (100.65.104.3): 56 data bytes
^C
--- 100.65.104.3 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$ ping 100.65.212.3
PING 100.65.212.3 (100.65.212.3): 56 data bytes
^C
--- 100.65.212.3 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$

I tried to attach to them but I don't think the results are very
satisfactory:

(gdb) attach 59918
Attaching to program: /usr/sbin/vmd, process 59918
ptrace: No such process.
(gdb) attach 88755
Attaching to program: /usr/sbin/vmd, process 88755
Couldn't get registers: Device busy.
Couldn't get registers: Device busy.
(gdb) Reading symbols from /usr/lib/libutil.so.13.0...done.
Reading symbols from /usr/lib/libevent.so.4.1...done.
Reading symbols from /usr/lib/libc.so.92.5...done.
Reading symbols from /usr/libexec/ld.so...done.
[New thread 376076]
[New thread 433606]
[Switching to thread 152280]
0x197245b834f5 in event_queue_insert (base=0x19719ae76400,
ev=, queue=8) at /usr/src/lib/libevent/event.c:954
954 /usr/src/lib/libevent/event.c: No such file or directory.
where
#0  0x197245b834f5 in event_queue_insert (base=0x19719ae76400,
ev=, queue=8) at /usr/src/lib/libevent/event.c:954
#1  event_active (ev=, res=1, ncalls=1) at
/usr/src/lib/libevent/event.c:806
#2  timeout_process (base=) at
/usr/src/lib/libevent/event.c:900
#3  event_base_loop (base=0x19719ae76400, flags=0) at
/usr/src/lib/libevent/event.c:499
#4  0x196f8940eeaf in ?? ()
#5  0x197277b6cdce in _rthread_start (v=0x19719ae76400) at
/usr/src/lib/librthread/rthread.c:96
#6  0x19726e03db0b in __tfork_thread () at
/usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
#7  0x in ?? ()
(gdb) info threads
  Id   Target Id Frame
* 1thread 152280 0x197245b834f5 in event_queue_insert
(base=0x19719ae76400, ev=, queue=8) at
/usr/src/lib/libevent/event.c:954
  2thread 376076 futex () at -:3
  3thread 433606 futex () at -:3
(gdb) thread 2
[Switching to thread 2 (thread 376076)]
#0  futex () at -:3
3   -: No such file or directory.
(gdb) bt
#0  futex () at -:3
#1  0x19726e08fed5 in _rthread_cond_timedwait 

Re: vmd losing VMs

2018-10-02 Thread Mike Larkin
On Tue, Oct 02, 2018 at 11:02:57AM -0700, Greg Steuck wrote:
> Dmitry, is there an easy way get at the VMs output?
> 
> Here's dmesg from VMM_DEBUG-enabled kernel (built at "Add support for
> RT3290 chipset by James Hastings."). No lockup thus far.
> 

Yeah the output below shows normal behaviour.

Probably a similar issue to the one reported a while back, but for that
case, vmd just went away (not spinning).

-ml

> OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct  2 10:37:13 PDT 2018
> syzkaller@ci-openbsd.syzkaller:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
> real mem = 17163079680 (16367MB)
> avail mem = 16633610240 (15863MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root

Re: vmd losing VMs

2018-10-02 Thread Greg Steuck
Dmitry, is there an easy way to get at the VMs' output?

Here's dmesg from VMM_DEBUG-enabled kernel (built at "Add support for
RT3290 chipset by James Hastings."). No lockup thus far.

OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct  2 10:37:13 PDT 2018
syzkaller@ci-openbsd.syzkaller
:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
real mem = 17163079680 (16367MB)
avail mem = 16633610240 (15863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC WAET SRAT
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2300.55 MHz, 06-3f-00
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 990MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.87 MHz, 06-3f-00
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.89 MHz, 06-3f-00
cpu3:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 1 (application processor)
cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
cpu4:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 1, core 0, package 0
cpu5 at mainbus0: apid 3 (application processor)
cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
cpu5:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu5: 256KB 64b/line 8-way L2 cache
cpu5: smt 1, core 1, package 0
cpu6 at mainbus0: apid 5 (application processor)
cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.94 MHz, 06-3f-00
cpu6:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu6: 256KB 64b/line 8-way L2 cache
cpu6: smt 1, core 2, package 0
cpu7 at mainbus0: apid 7 (application processor)
cpu7: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.91 MHz, 06-3f-00
cpu7:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu7: 256KB 64b/line 8-way L2 cache
cpu7: smt 1, core 3, package 0
ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpicpu2 at acpi0: C1(@1 halt!)
acpicpu3 at acpi0: C1(@1 halt!)

Re: vmd losing VMs

2018-10-02 Thread Reyk Floeter
Ok thanks.  I wonder if there is a tool that can log everything from
the VM's console ... It would be useful to see what happens to the VM
before it goes into zombie mode.

Reyk
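[One low-tech sketch of such a tool, assuming nothing beyond tee(1): wrap whatever reads the console so everything is appended to a per-VM file. On the vmd host the stand-in `echo` below would be replaced by the real console reader, e.g. `cu -l /dev/ttyp2` (the TTY shown by `vmctl status`) or `vmctl console ci-openbsd-main-1`; whether cu tolerates having its output piped is untried here, and script(1) may be the better wrapper for a fully interactive session. The helper name and log path are made up for illustration.]

```sh
# Hypothetical helper: run a console-reading command and append its
# output to a log file while still showing it on the terminal.
log_console() {
    vm=$1; logfile=$2; shift 2
    "$@" 2>&1 | tee -a "$logfile"
}

# Demo with echo standing in for the real console reader
# (e.g. "cu -l /dev/ttyp2" on the vmd host).
logfile=$(mktemp)
log_console ci-openbsd-main-1 "$logfile" echo "boot> booting hd0a:/bsd..."
cat "$logfile"
```

With the real reader substituted in, the log file would hold whatever the guest printed before it went into zombie mode.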

On Tue, Oct 02, 2018 at 10:33:11AM -0700, Greg Steuck wrote:
> I believe I was unable to ssh into them or get cu to elicit any characters.
> I'll verify next time it happens.
> 
> On Tue, Oct 2, 2018 at 10:20 AM Reyk Floeter  wrote:
> 
> > On Tue, Oct 02, 2018 at 10:10:41AM -0700, Greg Steuck wrote:
> > > Naturally, bugs don't solve themselves :) Here's a log, it's not very
> > > useful due to the lack of debugging symbols. Notice, that runaway vmds
> > > don't die on their own, they just spin out of control. I'll do VMM_DEBUG
> > > next.
> > >
> >
> > "they just spin out of control" - maybe I've missed the previous
> > details, but do you know what happens to these VMs?  Are they stuck in
> > ddb or in a reboot loop?
> >
> > Reyk
> >

Re: vmd losing VMs

2018-10-02 Thread Greg Steuck
I believe I was unable to ssh into them or get cu to elicit any characters.
I'll verify next time it happens.

On Tue, Oct 2, 2018 at 10:20 AM Reyk Floeter  wrote:

> On Tue, Oct 02, 2018 at 10:10:41AM -0700, Greg Steuck wrote:
> > Naturally, bugs don't solve themselves :) Here's a log, it's not very
> > useful due to the lack of debugging symbols. Notice, that runaway vmds
> > don't die on their own, they just spin out of control. I'll do VMM_DEBUG
> > next.
> >
>
> "they just spin out of control" - maybe I've missed the previous
> details, but do you know what happens to these VMs?  Are they stuck in
> ddb or in a reboot loop?
>
> Reyk
>

Re: vmd losing VMs

2018-10-02 Thread Reyk Floeter
On Tue, Oct 02, 2018 at 10:10:41AM -0700, Greg Steuck wrote:
> Naturally, bugs don't solve themselves :) Here's a log, it's not very
> useful due to the lack of debugging symbols. Notice, that runaway vmds
> don't die on their own, they just spin out of control. I'll do VMM_DEBUG
> next.
> 

"they just spin out of control" - maybe I've missed the previous
details, but do you know what happens to these VMs?  Are they stuck in
ddb or in a reboot loop?

Reyk


vmd losing VMs

2018-10-02 Thread Greg Steuck
Naturally, bugs don't solve themselves :) Here's a log; it's not very
useful due to the lack of debugging symbols. Notice that runaway vmds
don't die on their own, they just spin out of control. I'll do VMM_DEBUG
next.

ci-openbsd# sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd
kern.nosuidcoredump: 1 -> 3
ci-openbsd#  sysctl kern.nosuidcoredump
kern.nosuidcoredump=3
ci-openbsd# ps ax | grep vmd
32653 ??  Isp 0:01.28 /usr/sbin/vmd
89585 ??  Is  0:00.14 vmd: priv (vmd)
50191 ??  Isp 0:01.88 vmd: control (vmd)
33160 ??  Isp 0:08.55 vmd: vmm (vmd)
52853 ??  Rp/1  280:12.56 vmd: ci-openbsd-main-1 (vmd)
 3238 ??  Rp/2   48:56.13 vmd: ci-openbsd-main-0 (vmd)
44625 ??  Rp/0   2:54.05 vmd: ci-openbsd-main-2 (vmd)
42187 p5  R+p/0   0:00.00 grep vmd (ksh)
ci-openbsd# vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM   TTY     OWNER     NAME
  239 44625     1    512M    181M   ttyp2   syzkaller ci-openbsd-main-2
    1     -     1    512M       -   -       syzkaller syzkaller
ci-openbsd# kill 52853
ci-openbsd# ps ax | grep vmd
32653 ??  Ssp 0:01.28 /usr/sbin/vmd
89585 ??  Is  0:00.14 vmd: priv (vmd)
50191 ??  Ssp 0:01.88 vmd: control (vmd)
33160 ??  Ssp 0:08.55 vmd: vmm (vmd)
52853 ??  Rp/1  280:30.24 vmd: ci-openbsd-main-1 (vmd)
 3238 ??  Rp/2   49:14.27 vmd: ci-openbsd-main-0 (vmd)
44625 ??  Rp/0   3:12.25 vmd: ci-openbsd-main-2 (vmd)
27783 p5  R+p/1   0:00.00 grep vmd (ksh)
ci-openbsd# ps ax | grep syz
50771 ??  Is  0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b
/syzkaller/ramdisk
70067 ??  Is  0:00.16 sshd: syzkaller [priv] (sshd)
  957 ??  S   0:01.09 sshd: syzkaller@ttyp3 (sshd)
28743 ??  I  18:00.14 syzkaller/current/bin/syz-manager -config
/syzkaller/managers/main/current/manager.cfg
 5869 ??  Ip  0:00.07 ssh -p 22 -i /syzkaller/managers/main/current/key
-F /dev/null -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o Identi
57895 ??  Is  0:00.08 sshd: syzkaller [priv] (sshd)
77889 ??  S   0:00.17 sshd: syzkaller@ttyp4 (sshd)
39218 p5  R+/0   0:00.00 grep syz
50644 00- D   4:09.87 ./syz-ci -config ./config-openbsd.ci
60603 00- Ip  0:00.05 tee syz-ci.log
ci-openbsd# kill 50644 28743
ci-openbsd# ps ax | grep syz
50771 ??  Is  0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b
/syzkaller/ramdisk
70067 ??  Is  0:00.16 sshd: syzkaller [priv] (sshd)
  957 ??  I   0:01.09 sshd: syzkaller@ttyp3 (sshd)
57895 ??  Is  0:00.08 sshd: syzkaller [priv] (sshd)
77889 ??  S   0:00.18 sshd: syzkaller@ttyp4 (sshd)
15816 p5  R+/1   0:00.00 grep syz
ci-openbsd# rcctl stop vmd
vmd(ok)
ci-openbsd# ps ax | grep vmd
52853 ??  Rp/1  281:59.20 vmd: ci-openbsd-main-1 (vmd)
 3238 ??  Rp/2   50:42.87 vmd: ci-openbsd-main-0 (vmd)
19166 p5  R+/0   0:00.00 grep vmd
ci-openbsd# kill -ABRT 52853
ci-openbsd# ps ax | grep vmd
 3238 ??  Rp/2   52:06.55 vmd: ci-openbsd-main-0 (vmd)
55423 p5  S+p 0:00.01 grep vmd
ci-openbsd# dmesg | tail
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation)
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (dd61083aafe9fd0b.a) swap on sd0b dump on sd0b
ci-openbsd# ps ax | grep vmd
 3238 ??  Rp/2   52:06.88 vmd: ci-openbsd-main-0 (vmd)
19175 p5  R+/0   0:00.00 grep vmd
ci-openbsd# kill 3238
ci-openbsd# ps ax | grep vmd
 3238 ??  Rp/2   52:18.52 vmd: ci-openbsd-main-0 (vmd)
27516 p5  R+/3   0:00.00 grep vmd
ci-openbsd# kill -ABRT 3238
ci-openbsd# ps ax | grep vmd
 3238 ??  Rp/2   52:27.71 (vmd)
93083 p5  R+/3   0:00.00 grep vmd
ci-openbsd# ps ax | grep vmd
95984 p5  S+p 0:00.01 grep vmd
ci-openbsd# ls -l /var/crash/vmd
total 668864
-rw---  1 root  wheel  200320568 Oct  2 09:47 3238.core
-rw---  1 root  wheel  141988032 Oct  2 09:46 52853.core
ci-openbsd# gdb /usr/sb
ci-openbsd# file /var/crash/vmd/52853.core
/var/crash/vmd/52853.core: ELF 64-bit LSB core file x86-64, version 1
ci-openbsd# gdb /usr/sbin/vmd /var/crash/vmd/52853.core
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd6.4"...(no debugging
symbols found)

Core was generated by `vmd'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/lib/libpthread.so.25.1...done.
Loaded symbols for /usr/lib/libpthread.so.25.1
Loaded symbols for /usr/sbin/vmd
Reading symbols from /usr/lib/libutil.so.13.0...done.
Loaded symbols for /usr/lib/libutil.so.13.0
Symbols already loaded for /usr/lib/libpthread.so.25.1
Reading symbols from /usr/lib/libevent.so.4.1...done.
Loaded symbols for /usr/lib/libevent.so.4.1
Reading symbols from /usr/lib/libc.so.92.5...done.
Loaded symbols for 

Re: vmd losing VMs

2018-10-02 Thread Mike Larkin
On Mon, Oct 01, 2018 at 10:16:24PM -0700, Greg Steuck wrote:
> Thanks Mike.
> 

Thanks. Please let me know if you see any other problems.

-ml



Re: vmd losing VMs

2018-10-01 Thread Greg Steuck
Thanks Mike.

I've upgraded from Sep 27th to Sep 29th snapshot and so far I haven't seen
the problem with:

OpenBSD 6.4-beta (GENERIC.MP) #336: Sat Sep 29 08:13:15 MDT 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

pd@ is saying he's not responsible for the fix, but maybe something by reyk@
is?

I will apply the debugging tools should the problem recur.

BTW, the memory copy-pasto fix is working well. I was prevented from
running 4x2G VMs ;)

Thanks
Greg
-- 
nest.cx is Gmail hosted, use PGP for anything private. Key:
http://goo.gl/6dMsr
Fingerprint: 5E2B 2D0E 1E03 2046 BEC3  4D50 0B15 42BD 8DF5 A1B0


Re: vmd losing VMs

2018-09-30 Thread Mike Larkin
On Fri, Sep 28, 2018 at 05:01:27PM -0700, Greg Steuck wrote:
> I've been running syzkaller for about a day now. It launches/kills VMs all
> the time. Somewhere along the way vmd seems to have lost track of one of
> its VMs. Notice how syzkaller ci-openbsd-main-1 is conspicuously missing
> from vmctl status even though there's a process chewing the CPU for it.
> Some of the data may not be perfectly aligned because syzkaller is still
> working, so don't be alarmed that some ids don't quite line up. The
> important part is the runaway PID 49113.
> 
> All of syzkaller machinery is running as syzkaller user, so it shouldn't be
> messing with anything as root. In fact, the whole machine setup is
> automated:
> https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh
> What you see there is literally what's running (modulo missing syzkaller
> config and everything that syzkaller does as an ordinary user).
> 
> I'll keep this system limping along should any debug ideas arise.
> 
> OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018
> 
> ci-openbsd$ vmctl status
>    ID   PID VCPUS  MAXMEM  CURMEM   TTY     OWNER     NAME
>   236 40918     1    2.0G    335M   ttyp0   syzkaller ci-openbsd-main-2
>   235 85341     1    2.0G    398M   ttyp2   syzkaller ci-openbsd-main-0
>     1     -     1    512M       -   -       syzkaller syzkaller
> 

1. Are you getting any vmd cores?
 * sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd
 * kill all vmd and restart the test. Not sure if that sysctl can be done
   after boot or if it needs to be in /etc/sysctl.conf

2. building a host kernel with VMM_DEBUG will be helpful here to see why VMs
   are disappearing.

3. You're running this nested in GCP? Do you know what VMX features they
   expose to guests? Perhaps there is an assumption being made in vmm that
   we have a certain feature not being exposed by the underlying GCP hypervisor
   (although I'm pretty sure that's not the case, might be good to check - 
   VMM_DEBUG will tell us this).

-ml
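[Step 1 above can be scripted. A minimal sketch, assuming cores land in /var/crash/vmd named <pid>.core (as the later `ls -l /var/crash/vmd` in this thread shows); the directory is a parameter so the listing logic can be exercised against a scratch copy instead of the real crash directory.]

```sh
# List vmd core files and print the gdb invocation that would produce
# a backtrace for each. The function only inspects the directory; it
# does not run gdb itself.
list_vmd_cores() {
    dir=$1
    for core in "$dir"/*.core; do
        [ -e "$core" ] || continue
        pid=${core##*/}; pid=${pid%.core}
        printf 'pid %s: gdb /usr/sbin/vmd %s\n' "$pid" "$core"
    done
}

# Demo against a scratch directory standing in for /var/crash/vmd,
# seeded with the two PIDs seen in this thread.
crashdir=$(mktemp -d)
touch "$crashdir/3238.core" "$crashdir/52853.core"
list_vmd_cores "$crashdir"
```

On the real host the backtraces are only useful if /usr/sbin/vmd matches the binary that dumped and carries debug symbols, which is exactly what the gdb transcript later in the thread lacked.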

> ci-openbsd$ dmesg
> OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Re: vmd losing VMs

2018-09-29 Thread Greg Steuck
Another one bit the dust, ci-openbsd-main-0 this time.

ci-openbsd$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM  TTY    OWNER      NAME
  390 35277     1    2.0G    474M  ttyp2  syzkaller  ci-openbsd-main-2
    1     -     1    512M       -  -      syzkaller  syzkaller
ci-openbsd$ ps axu | grep vmd
_vmd     35277 109.7  0.3 2100040 98340 ??  Rp/2   9:49AM   12:56.84 vmd: ci-openbsd-main-2 (vmd)
_vmd     49113  96.5  0.2 2099984 47868 ??  Rp/3  Fri07AM 1572:15.59 vmd: ci-openbsd-main-1 (vmd)
_vmd     94581  97.4  0.2 2099944 48344 ??  Rp/1   8:31AM   92:02.56 vmd: ci-openbsd-main-0 (vmd)
_vmd     29877   0.0  0.0    1364  1904 ??  Ssp   Thu01PM    0:02.65 vmd: control (vmd)
_vmd     38711   0.0  0.0    1432  2028 ??  Ssp   Thu01PM    0:37.50 vmd: vmm (vmd)
root     76140   0.0  0.0    1156  1684 ??  Is    Thu01PM    0:00.24 vmd: priv (vmd)
root     23839   0.0  0.0    1624  2004 ??  Ssp   Thu01PM    0:01.97 /usr/sbin/vmd
syzkalle 51031   0.0  0.0     284  1216 p3  S+p   10:04AM    0:00.01 grep vmd
ci-openbsd$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags  Refs   Use    Mtu  Prio  Iface
default            10.128.0.1         UGS       9  4107      -     8  vio0
224/4              127.0.0.1          URS       0     0  32768     8  lo0
10.128.0.1         42:01:0a:80:00:01  UHLch     1  2998      -     7  vio0
10.128.0.1/32      10.128.0.45        UCS       1     0      -     8  vio0
10.128.0.45        42:01:0a:80:00:2d  UHLl      0  2859      -     1  vio0
10.128.0.45/32     10.128.0.45        UCn       0     0      -     4  vio0
100.64.134.2/31    100.64.134.2       UCn       0     3      -     4  tap1
100.64.134.2       fe:e1:ba:d4:93:1e  UHLl      0     0      -     1  tap1
100.65.132.2/31    100.65.132.2       UCn       0     3      -     4  tap0
100.65.132.2       fe:e1:ba:dc:a6:c5  UHLl      0     0      -     1  tap0
100.65.134.2/31    100.65.134.2       UCn       1     3      -     4  tap2
100.65.134.2       fe:e1:ba:de:95:db  UHLl      0     4      -     1  tap2
100.65.134.3       fe:e1:bb:d1:c8:39  UHLc      2    10      -     3  tap2
127/8              127.0.0.1          UGRS      0     0  32768     8  lo0
127.0.0.1          127.0.0.1          UHhl      1  4159  32768     1  lo0

ci-openbsd$ grep vmd /var/log/daemon | tail -50
Sep 29 03:49:32 ci-openbsd vmd[23839]: config_setvm: failed to start vm ci-openbsd-main-2
Sep 29 03:50:10 ci-openbsd vmd[23839]: ci-openbsd-main-2: user 1000 cpu limit reached
Sep 29 03:50:10 ci-openbsd vmd[23839]: config_setvm: failed to start vm ci-openbsd-main-2
Sep 29 03:50:41 ci-openbsd vmd[23839]: ci-openbsd-main-2: user 1000 cpu limit reached
Sep 29 03:50:41 ci-openbsd vmd[23839]: config_setvm: failed to start vm ci-openbsd-main-2
Sep 29 03:50:53 ci-openbsd vmd[72457]: ci-openbsd-main-test-0: vcpu_deassert_pic_irq: can't deassert INTR for vm_id 357, vcpu_id 0
Sep 29 03:50:53 ci-openbsd vmd[72457]: vcpu_run_loop: vm 357 / vcpu 0 run ioctl failed: No such file or directory
Sep 29 03:50:53 ci-openbsd vmd[6507]: ci-openbsd-main-test-1: vcpu_assert_pic_irq: can't assert INTR
Sep 29 03:50:57 ci-openbsd vmd[90226]: ci-openbsd-main-0: vcpu_assert_pic_irq: can't assert INTR
Sep 29 03:51:15 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 366 successfully, tty /dev/ttyp0
Sep 29 03:51:15 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 367 successfully, tty /dev/ttyp2
Sep 29 04:52:51 ci-openbsd vmd[37572]: ci-openbsd-main-2: can't clear INTR: No such file or directory
Sep 29 04:53:19 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 368 successfully, tty /dev/ttyp0
Sep 29 04:53:20 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 369 successfully, tty /dev/ttyp2
Sep 29 05:15:29 ci-openbsd vmd[59496]: ci-openbsd-main-2: can't clear INTR: No such file or directory
Sep 29 05:16:39 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 370 successfully, tty /dev/ttyp0
Sep 29 05:16:40 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 371 successfully, tty /dev/ttyp2
Sep 29 05:18:01 ci-openbsd vmd[11674]: ci-openbsd-main-2: can't set INTR
Sep 29 05:18:01 ci-openbsd vmd[11674]: ci-openbsd-main-2: can't set INTR: No such file or directory
Sep 29 05:18:14 ci-openbsd vmd[36211]: ci-openbsd-main-0: vcpu_assert_pic_irq: can't assert INTR
Sep 29 05:18:16 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 372 successfully, tty /dev/ttyp0
Sep 29 05:18:23 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 373 successfully, tty /dev/ttyp2
Sep 29 05:22:28 ci-openbsd vmd[43464]: ci-openbsd-main-2: vcpu_deassert_pic_irq: can't deassert INTR for vm_id 365, vcpu_id 0
Sep 29 05:22:39 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 374 successfully, tty /dev/ttyp0
Sep 29 05:23:31 ci-openbsd vmd[11677]: ci-openbsd-main-0: can't set INTR: No such file or directory
Sep 29 05:23:41 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 375 successfully, tty /dev/ttyp2
Sep 29 05:25:06 ci-openbsd vmd[43334]: ci-openbsd-main-2: 
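Since the same few messages repeat, a quick pipeline can group them by type. This is only a triage sketch I put together, not anything vmd provides; the sed patterns that strip the syslog prefix and mask per-run numbers are my own normalization:

```shell
# Group vmd log messages by type so repeats collapse into counted buckets.
summarize_vmd_errors() {
    # Strip everything up to "vmd[PID]: ", then mask per-run ids
    # ("vm_id 357", "vm 366") so otherwise-identical messages match.
    sed -n 's/^.*vmd\[[0-9]*\]: //p' |
        sed -e 's/vm_id [0-9][0-9]*/vm_id N/g' -e 's/vm [0-9][0-9]*/vm N/g' |
        sort | uniq -c | sort -rn
}

# Usage: grep vmd /var/log/daemon | summarize_vmd_errors | head
```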

vmd losing VMs

2018-09-28 Thread Greg Steuck
I've been running syzkaller for about a day now. It launches and kills VMs all
the time. Somewhere along the way vmd seems to have lost track of one of
its VMs: notice how ci-openbsd-main-1 is conspicuously missing from
vmctl status even though there's still a process chewing CPU for it.
Some of the data may not line up perfectly because syzkaller is still
working, so don't be alarmed that some ids don't quite match. The
important part is the runaway PID 49113.
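In case it helps reproduce, this is roughly how I spot the state: diff the names vmctl reports against the "vmd: NAME (vmd)" process titles. A hypothetical helper (lost_vms is my own name, not a vmctl command), operating on saved output so it's easy to inspect afterwards:

```shell
# Print names of guests that still have a vmd process but are
# missing from `vmctl status` output.
lost_vms() {
    status_file=$1 ps_file=$2
    # NAME is the last column of vmctl status; skip the header row.
    awk 'NR > 1 { print $NF }' "$status_file" | sort > /tmp/lost_vms.status
    # Guest processes are titled "vmd: NAME (vmd)"; drop vmd's own workers.
    sed -n 's/^.*vmd: \(.*\) (vmd).*$/\1/p' "$ps_file" |
        grep -Ev '^(priv|control|vmm)$' | sort > /tmp/lost_vms.ps
    # Names unique to the ps list are VMs vmd has lost track of.
    comm -13 /tmp/lost_vms.status /tmp/lost_vms.ps
}

# Usage: vmctl status > /tmp/s; ps axu > /tmp/p; lost_vms /tmp/s /tmp/p
```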

All of the syzkaller machinery runs as the syzkaller user, so it shouldn't be
messing with anything as root. In fact, the whole machine setup is
automated:
https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh
What you see there is literally what's running (modulo the missing syzkaller
config and everything syzkaller does as an ordinary user).

I'll keep this system limping along should any debug ideas arise.

OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018

ci-openbsd$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM  TTY    OWNER      NAME
  236 40918     1    2.0G    335M  ttyp0  syzkaller  ci-openbsd-main-2
  235 85341     1    2.0G    398M  ttyp2  syzkaller  ci-openbsd-main-0
    1     -     1    512M       -  -      syzkaller  syzkaller

ci-openbsd$ dmesg
OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 32195465216 (30703MB)
avail mem = 31210467328 (29764MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcc0 (22 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC WAET SRAT
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2070.62 MHz, 06-3f-00
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.89 MHz, 06-3f-00
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.88 MHz, 06-3f-00
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.88 MHz, 06-3f-00
cpu3:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 1 (application processor)
cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.89 MHz, 06-3f-00
cpu4:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 1, core 0, package 0
cpu5 at mainbus0: apid 3 (application processor)
cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.92 MHz, 06-3f-00
cpu5:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu5: 256KB 64b/line 8-way L2 cache
cpu5: smt 1, core 1, package 0
cpu6 at mainbus0: apid 5 (application processor)
cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz,