Re: vmd losing VMs
Not that I expected anything to change considering the patches submitted, but the Oct 21 snapshot (minus "Add support to create and convert disk images from existing images") is similarly afflicted. Any candidate fixes or patches for added logging will be put to test in short order :)

ci-openbsd$ dmesg | head
OpenBSD 6.4-current (GENERIC.MP) #376: Sun Oct 21 22:46:20 MDT 2018
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17163079680 (16367MB)
avail mem = 16633651200 (15863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine

ci-openbsd$ ps ax | grep vm
75798 ??  Is      0:00.04 vmd: priv (vmd)
70138 ??  Isp     0:02.35 /usr/sbin/vmd
49408 ??  Isp     0:01.25 vmd: vmm (vmd)
87301 ??  Isp     0:04.63 vmd: control (vmd)
83798 ??  Rp/2  874:45.69 vmd: ci-openbsd-main-0 (vmd)
74819 ??  Rp/0  469:43.70 vmd: ci-openbsd-main-1 (vmd)
52785 ??  Rp/3    8:04.52 vmd: ci-openbsd-main-2 (vmd)
58963 ??  Ip      0:00.02 vmctl stop ci-openbsd-main-0 -f -w
25424 ??  Ip      0:00.02 vmctl stop ci-openbsd-main-1 -f -w
58043 p3  S+p     0:00.01 grep vm
Re: vmd losing VMs
On Wed, Oct 17, 2018 at 9:40 PM, Greg Steuck wrote:
> On Wed, Oct 17, 2018 at 12:00 PM Dmitry Vyukov wrote:
>> On Wed, Oct 17, 2018 at 4:50 AM, Greg Steuck wrote:
>> > I think I see some evidence of an occasional VM just never finishing
>> > booting to the point of running sshd. At least that's what I surmised
>> > from the posted manager.log file.
>> > https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632
>>
>> This should be fine in the sense that it should not lead to vmd losing
>> VMs.
>
> I agree that vmd shouldn't be losing the VMs in such a case. But some
> evidence shows they become unkillable, as described in my previous message:
> https://marc.info/?l=openbsd-tech&m=153955188302856&w=2

But VMM has no idea about the guest OS, OS booting, etc. For VMM it's just a CPU that runs some instruction stream. There may be no OS at all. So I don't see how this can affect vmd behavior. If killing a VM leads to hangs, then most likely that can happen equally whether or not the kill occurs during boot.

>> If a test machine does not come up live within some time frame, we
>> kill it and create a new one.
>
> How tight a timeout does it have?

10 mins to print an IP, and then +20 mins for sshd to come up:
https://github.com/google/syzkaller/blob/master/vm/vmm/vmm.go#L215
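The deadline logic described above (wait some bounded time for the guest to become reachable, then give up) can be sketched generically in shell. `wait_for` is a hypothetical helper, not syzkaller's actual implementation:

```shell
# Hedged sketch: poll a probe command until it succeeds or a deadline
# passes. wait_for is a made-up name; syzkaller does this in Go.
wait_for() {
  deadline=$(( $(date +%s) + $1 )); shift
  until "$@" 2>/dev/null; do
    # Give up once the deadline has passed.
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep 1
  done
  return 0
}

# e.g. give sshd up to 20 minutes, probing with a cheap ssh command:
# wait_for 1200 ssh -o ConnectTimeout=10 root@100.66.10.3 pwd
```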
Re: vmd losing VMs
> Can you net out the current issues you're facing? I can't seem to grok
> the log file posted above (not sure what I'm looking at). From reviewing
> the thread, it seems the core dumps are gone but there may (?) be some
> issues still?

The core dumps are gone indeed. Thanks for the prompt fixes! I'm using the Oct 11 snapshot.

The issue that's still vexing is VMs sticking in limbo: there's a vmd process spinning at 100% CPU, vmctl status doesn't show the VM, and vmctl stop hangs waiting for a response from vmd. I don't think there's enough logging in vmd; at least I don't see anything revealing in /var/log/daemon despite running with -vv. I pasted the data overlapping with https://marc.info/?l=openbsd-tech&m=153955188302856&w=2 as https://gist.github.com/blackgnezdo/ebd728246abe418b1867df1361c95e27.

ci-openbsd$ ps ax | grep vm
34144 ??  Is      0:00.39 vmd: priv (vmd)
63860 ??  Isp     0:04.99 /usr/sbin/vmd -vv
 4223 ??  Isp     0:06.20 vmd: vmm (vmd)
71772 ??  Isp     0:09.01 vmd: control (vmd)
72373 ??  Rp/3 1136:32.53 vmd: ci-openbsd-main-1 (vmd)
90783 ??  Rp/3   50:24.59 vmd: ci-openbsd-main-2 (vmd)
47967 ??  Rp/0   49:57.49 vmd: ci-openbsd-main-0 (vmd)
55129 ??  Ip      0:00.02 vmctl stop ci-openbsd-main-1 -f -w

I could work around this in classic sysadmin fashion by writing a script which greps for any long-running vmctl stop processes, then kills the corresponding vmd. I think this works, but I also suspect that the internal vmd accounting which imposes the 4-VM limit will soon get in the way.

BTW, we could use a higher VM count limit. I don't want it badly enough to implement config-passing code or run the VMs as root, hi Reyk :)

The log was from syz-manager, so it was mostly for Dmitry's benefit.

Thanks
Greg
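The workaround described above could be sketched roughly as below; `find_stuck_stops` is a made-up helper name, and a real version would also filter by process age rather than matching every `vmctl stop`:

```shell
# Hedged sketch of the described workaround, assumptions noted above.
# Extract VM names from "vmctl stop NAME -f -w" entries in ps output
# read on stdin.
find_stuck_stops() {
  awk '/vmctl stop/ { for (i = 1; i < NF; i++) if ($i == "stop") print $(i + 1) }'
}

# A periodic job could then kill the matching vmd child, e.g.:
# ps ax | find_stuck_stops | while read -r name; do
#   pkill -f "vmd: $name"
# done
```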
Re: vmd losing VMs
On Wed, Oct 17, 2018 at 12:00 PM Dmitry Vyukov wrote:
> On Wed, Oct 17, 2018 at 4:50 AM, Greg Steuck wrote:
> > I think I see some evidence of an occasional VM just never finishing
> > booting to the point of running sshd. At least that's what I surmised
> > from the posted manager.log file.
> > https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632
>
> This should be fine in the sense that it should not lead to vmd losing
> VMs.

I agree that vmd shouldn't be losing the VMs in such a case. But some evidence shows they become unkillable, as described in my previous message:
https://marc.info/?l=openbsd-tech&m=153955188302856&w=2

> If a test machine does not come up live within some time frame, we
> kill it and create a new one.

How tight a timeout does it have?

Thanks
Greg
Re: vmd losing VMs
On Wed, Oct 17, 2018 at 4:50 AM, Greg Steuck wrote:
> I think I see some evidence of an occasional VM just never finishing
> booting to the point of running sshd. At least that's what I surmised
> from the posted manager.log file.
> https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632

This should be fine in the sense that it should not lead to vmd losing VMs. If a test machine does not come up live within some time frame, we kill it and create a new one.

> Here's an excerpt:
> 2018/10/16 19:35:08 VMs 2, executed 68322, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:13 failed to create instance: can't ssh into the instance:
> failed to run ["ssh" "-p" "22" "-i" "/syzkaller/managers/main/current/key"
> "-F" "/dev/null" "-o" "UserKnownHostsFile=/dev/null" "-o" "BatchMode=yes"
> "-o" "IdentitiesOnly=yes" "-o" "StrictHostKeyChecking=no" "-o"
> "ConnectTimeout=10" "-i" "/syzkaller/managers/main/current/key"
> "root@100.66.10.3" "pwd"]: exit status 255
> ssh: connect to host 100.66.10.3 port 22: Operation timed out
>
> [ using 1975784 bytes of bsd ELF symbol table ]
> Copyright (c) 1982, 1986, 1989, 1991, 1993
>     The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2018 OpenBSD.  All rights reserved.
> https://www.OpenBSD.org
> OpenBSD 6.4-current (SYZKALLER) #73: Tue Oct 16 10:17:58 PDT 2018
>     root@ci-openbsd.syzkaller:/syzkaller/managers/main/kernel/sys/arch/amd64/compile/SYZKALLER
> real mem = 520093696 (496MB)
> avail mem = 493764608 (470MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0
> acpi at bios0 not configured
> cpu0 at mainbus0: (uniprocessor)
> cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2335.02 MHz, 06-3f-00
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> pvbus0 at mainbus0: OpenBSD
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
> viornd0 at virtio0
> virtio0: irq 3
> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
> vioblk0 at virtio1
> scsibus1 at vioblk0: 2 targets
> sd0 at scsibus1 targ 0 lun 0: SCSI3 0/direct fixed
> sd0: 1024MB, 512 bytes/sector, 2097152 sectors
> virtio1: irq 5
> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
> vio0 at virtio2: address fe:e1:bb:d1:d0:3f
> virtio2: irq 6
> virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
> vmmci0 at virtio3
> virtio3: irq 7
> isa0 at mainbus0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
> com0: console
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> root on sd0a (e12e74b46a21f42a.a) swap on sd0b dump on sd0b
> Automatic boot in progress: starting file system checks.
> /dev/sd0a (e12e74b46a21f42a.a): file system is clean; not checking
> setting tty flags
> starting network
> vio0: bound to 100.66.10.3 from 100.66.10.2 (fe:e1:bb:d1:d0:40)
> starting early daemons: syslogd.
> starting RPC daemons:.
> savecore: no core dump
> acpidump: Can't find ACPI information
> checking quotas: done.
> clearing /tmp
> kern.securelevel: 0 -> 1
> creating runtime link editor directory cache.
> preserving editor files.
> 2018/10/16 19:35:18 VMs 2, executed 68355, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:18 hub sync: send: add 1, del 0, repros 0; recv: progs 0, repros 0; more 0
> 2018/10/16 19:35:28 VMs 2, executed 68406, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:38 VMs 2, executed 68451, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:48 VMs 2, executed 68479, cover 18582, crashes 78, repro 0
> 2018/10/16 19:35:58 VMs 2, executed 68501, cover 18582, crashes 78, repro 0
> 2018/10/16 19:36:08 VMs 2, executed 68569, cover 18582, crashes 78, repro 0
> 2018/10/16 19:36:16 failed to create instance: vmm exited
> vmctl: start vm command failed: Too many processes
>
> On Wed, Oct 10, 2018 at 3:06 AM Dmitry Vyukov wrote:
>> On Tue, Oct 2, 2018 at 8:02 PM, Greg Steuck wrote:
>> > Dmitry, is there an easy way to get at the VMs output?
>>
>> You mean th
Re: vmd losing VMs
I think I see some evidence of an occasional VM just never finishing booting to the point of running sshd. At least that's what I surmised from the posted manager.log file.
https://gist.github.com/blackgnezdo/a69e83c42c0c4cbbd53c7f3b35e91632

Here's an excerpt:

2018/10/16 19:35:08 VMs 2, executed 68322, cover 18582, crashes 78, repro 0
2018/10/16 19:35:13 failed to create instance: can't ssh into the instance:
failed to run ["ssh" "-p" "22" "-i" "/syzkaller/managers/main/current/key"
"-F" "/dev/null" "-o" "UserKnownHostsFile=/dev/null" "-o" "BatchMode=yes"
"-o" "IdentitiesOnly=yes" "-o" "StrictHostKeyChecking=no" "-o"
"ConnectTimeout=10" "-i" "/syzkaller/managers/main/current/key"
"root@100.66.10.3" "pwd"]: exit status 255
ssh: connect to host 100.66.10.3 port 22: Operation timed out

[ using 1975784 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2018 OpenBSD.  All rights reserved.
https://www.OpenBSD.org
OpenBSD 6.4-current (SYZKALLER) #73: Tue Oct 16 10:17:58 PDT 2018
    root@ci-openbsd.syzkaller:/syzkaller/managers/main/kernel/sys/arch/amd64/compile/SYZKALLER
real mem = 520093696 (496MB)
avail mem = 493764608 (470MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0
acpi at bios0 not configured
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2335.02 MHz, 06-3f-00
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
pvbus0 at mainbus0: OpenBSD
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
viornd0 at virtio0
virtio0: irq 3
virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio1
scsibus1 at vioblk0: 2 targets
sd0 at scsibus1 targ 0 lun 0: SCSI3 0/direct fixed
sd0: 1024MB, 512 bytes/sector, 2097152 sectors
virtio1: irq 5
virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio2: address fe:e1:bb:d1:d0:3f
virtio2: irq 6
virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
vmmci0 at virtio3
virtio3: irq 7
isa0 at mainbus0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
com0: console
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (e12e74b46a21f42a.a) swap on sd0b dump on sd0b
Automatic boot in progress: starting file system checks.
/dev/sd0a (e12e74b46a21f42a.a): file system is clean; not checking
setting tty flags
starting network
vio0: bound to 100.66.10.3 from 100.66.10.2 (fe:e1:bb:d1:d0:40)
starting early daemons: syslogd.
starting RPC daemons:.
savecore: no core dump
acpidump: Can't find ACPI information
checking quotas: done.
clearing /tmp
kern.securelevel: 0 -> 1
creating runtime link editor directory cache.
preserving editor files.
2018/10/16 19:35:18 VMs 2, executed 68355, cover 18582, crashes 78, repro 0
2018/10/16 19:35:18 hub sync: send: add 1, del 0, repros 0; recv: progs 0, repros 0; more 0
2018/10/16 19:35:28 VMs 2, executed 68406, cover 18582, crashes 78, repro 0
2018/10/16 19:35:38 VMs 2, executed 68451, cover 18582, crashes 78, repro 0
2018/10/16 19:35:48 VMs 2, executed 68479, cover 18582, crashes 78, repro 0
2018/10/16 19:35:58 VMs 2, executed 68501, cover 18582, crashes 78, repro 0
2018/10/16 19:36:08 VMs 2, executed 68569, cover 18582, crashes 78, repro 0
2018/10/16 19:36:16 failed to create instance: vmm exited
vmctl: start vm command failed: Too many processes

On Wed, Oct 10, 2018 at 3:06 AM Dmitry Vyukov wrote:
> On Tue, Oct 2, 2018 at 8:02 PM, Greg Steuck wrote:
> > Dmitry, is there an easy way to get at the VMs output?
>
> You mean the test VM instances (vmd), right?
>
> We capture vmm/kernel output for crashes, you can see it in the Log
> columns for each crash here:
> https://syzkaller.appspot.com/#openbsd
>
> Also if syz-manager is started with -debug flag, then it dumps
> everything into terminal in real time, including vmm/kernel output.
> This is intended for debugging of reliable mis-behaviors (the thing is
> not working at all).
>
> > Here's dmesg from a VMM_DEBUG-enabled kernel (built at "Add support for
> > RT3290 chipset by James Hastings."). No lockup thus far.
> >
> > OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct 2 10:37:13 PDT 2018
> >     syzkaller@ci-openbsd.syzkaller:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
> > real mem = 17163079680 (16367MB)
> > avail mem = 16633610240 (15863MB)
> > mpath0 at root
> > scsibus0 at mpath0: 256 targets
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
> > bios0: vendor Google version "Google" date 01/01/2011
> >
Re: vmd losing VMs
Now that I'm running OpenBSD 6.4 (GENERIC.MP) #362: Thu Oct 11 04:53:41 MDT 2018, I can start debugging again. I just observed an interesting tidbit which I failed to notice before. Namely, there are also hanging vmctl processes trying to stop those spinning VMs.

So, I tried to reproduce this myself. The first attempt shows that vmd is somewhat aware of the VM's presence (even though it doesn't report it in vmctl status).

ci-openbsd$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY     OWNER NAME
    1     -     1    512M       -       - syzkaller syzkaller
ci-openbsd$ ./obj/vmctl stop ci-openbsd-main-2x -f -w
stopping vm ci-openbsd-main-2x: vm not found

^^^ - Here, a random VM name is refused. OTOH, trying to stop a previously known (and currently spinning) VM causes a hang in imsg_read.

ci-openbsd$ gdb -q -- /syzkaller/src/usr.sbin/vmctl/obj/vmctl
(gdb) run stop ci-openbsd-main-2 -f -w
Starting program: /syzkaller/src/usr.sbin/vmctl/obj/vmctl stop ci-openbsd-main-2 -f -w
stopping vm ci-openbsd-main-2: ^C
Current language:  auto; currently asm
(gdb) where
#0  _thread_sys_recvmsg () at -:3
#1  0x1c25828acd6e in _libc_recvmsg_cancel (fd=Variable "fd" is not available.) at /usr/src/lib/libc/sys/w_recvmsg.c:27
#2  0x1c2520999521 in imsg_read (ibuf=0x1c24fc4ed000) at /usr/src/lib/libutil/imsg.c:82
#3  0x1c22f490392c in vmmaction (res=Variable "res" is not available.) at /syzkaller/src/usr.sbin/vmctl/main.c:273
#4  0x1c22f4902fd2 in ctl_stop (res=0x7f7e6f80, argc=Variable "argc" is not available.) at /syzkaller/src/usr.sbin/vmctl/main.c:793
#5  0x1c22f490351e in parse (argc=4, argv=Variable "argv" is not available.) at /syzkaller/src/usr.sbin/vmctl/main.c:172
#6  0x1c22f49033be in main (argc=4, argv=Variable "argv" is not available.) at /syzkaller/src/usr.sbin/vmctl/main.c:134
(gdb)

ci-openbsd$ uname -a
OpenBSD ci-openbsd.syzkaller 6.4 GENERIC.MP#362 amd64

ci-openbsd$ dmesg | head
OpenBSD 6.4 (GENERIC.MP) #362: Thu Oct 11 04:53:41 MDT 2018
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17163079680 (16367MB)
avail mem = 16633643008 (15863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine

ci-openbsd$ ps ax | grep vm
55596 ??  Ssp     0:04.86 vmd: vmm (vmd)
22978 ??  Is      0:00.22 vmd: priv (vmd)
52555 ??  Ssp     0:13.01 vmd: control (vmd)
17471 ??  Ssp     0:06.15 /usr/sbin/vmd
29044 ??  Rp/0 2197:50.09 vmd: ci-openbsd-main-2 (vmd)
52266 ??  Rp/1  257:11.58 vmd: ci-openbsd-main-1 (vmd)
15989 ??  Rp/1  241:18.45 vmd: ci-openbsd-main-0 (vmd)
19071 ??  Ip      0:00.02 vmctl stop ci-openbsd-main-1 -f -w
88222 ??  Ip      0:00.02 vmctl stop ci-openbsd-main-0 -f -w
 6142 ??  Sp      0:00.01 vmctl stop ci-openbsd-main-2 -f -w

ci-openbsd$ dmesg | tail
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation)
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (11584d676adca97e.a) swap on sd0b dump on sd0b

Oct 13 19:38:41 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm 330 successfully, tty /dev/ttyp0
Oct 13 20:16:56 ci-openbsd vmd[29583]: ci-openbsd-main-1: vcpu_assert_pic_irq: can't assert INTR
Oct 13 20:17:00 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm 331 successfully, tty /dev/ttyp0
Oct 13 20:22:04 ci-openbsd vmd[31844]: vcpu_run_loop: vm 320 / vcpu 0 run ioctl failed: No such file or directory
Oct 13 20:22:07 ci-openbsd vmd[17471]: ci-openbsd-main-0: started vm 332 successfully, tty /dev/ttyp2
Oct 13 21:12:18 ci-openbsd vmd[57830]: ci-openbsd-main-1: can't clear INTR: No such file or directory
Oct 13 21:12:22 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm 333 successfully, tty /dev/ttyp0
Oct 13 21:23:39 ci-openbsd vmd[81046]: ci-openbsd-main-0: can't clear INTR: No such file or directory
Oct 13 21:23:42 ci-openbsd vmd[17471]: ci-openbsd-main-0: started vm 334 successfully, tty /dev/ttyp2
Oct 13 21:43:42 ci-openbsd vmd[59472]: ci-openbsd-main-0: can't clear INTR: No such file or directory
Oct 13 21:43:47 ci-openbsd vmd[17471]: ci-openbsd-main-0: started vm 335 successfully, tty /dev/ttyp2
Oct 13 21:52:36 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm 336 successfully, tty /dev/ttyp0
Oct 13 22:06:47 ci-openbsd vmd[58824]: ci-openbsd-main-1: vcpu_assert_pic_irq: can't assert INTR
Oct 13 22:06:51 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm 337 successfully, tty /dev/ttyp0
Oct 13 22:13:31 ci-openbsd vmd[6946]: ci-openbsd-main-1: vcpu_assert_pic_irq: can't assert INTR
Oct 13 22:13:35 ci-openbsd vmd[17471]: ci-openbsd-main-1: started vm 338 successfully, tty /dev/ttyp0
Oct 13 22:45:14 ci-openbsd vmd[45351]: ci-openbsd-main-0: vcpu_assert_pic_irq:
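Since `vmctl stop` blocks indefinitely in imsg_read here, one way to keep automation from piling up stuck vmctl processes is a watchdog wrapper. This is a hedged sketch under the assumption that no timeout(1) utility is available; `with_deadline` is a made-up name:

```shell
# Hedged sketch: run a command, killing it if still alive after $1 seconds.
with_deadline() {
  secs=$1; shift
  "$@" &                                  # run the command in the background
  cmd=$!
  ( sleep "$secs"; kill "$cmd" 2>/dev/null ) &  # watchdog subshell
  dog=$!
  wait "$cmd"; rc=$?                      # nonzero if the watchdog fired
  kill "$dog" 2>/dev/null                 # cancel the watchdog on success
  return $rc
}

# e.g.: with_deadline 30 vmctl stop ci-openbsd-main-2 -f -w
```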
Re: vmd losing VMs
On Tue, Oct 2, 2018 at 8:02 PM, Greg Steuck wrote:
> Dmitry, is there an easy way to get at the VMs output?

You mean the test VM instances (vmd), right?

We capture vmm/kernel output for crashes; you can see it in the Log column for each crash here:
https://syzkaller.appspot.com/#openbsd

Also, if syz-manager is started with the -debug flag, then it dumps everything to the terminal in real time, including vmm/kernel output. This is intended for debugging of reliable misbehaviors (the thing is not working at all).

> Here's dmesg from a VMM_DEBUG-enabled kernel (built at "Add support for
> RT3290 chipset by James Hastings."). No lockup thus far.
>
> OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct 2 10:37:13 PDT 2018
>     syzkaller@ci-openbsd.syzkaller:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
> real mem = 17163079680 (16367MB)
> avail mem = 16633610240 (15863MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
> bios0: vendor Google version "Google" date 01/01/2011
> bios0: Google Google Compute Engine
> acpi0 at bios0: rev 0
> acpi0: sleep states S3 S4 S5
> acpi0: tables DSDT FACP SSDT APIC WAET SRAT
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2300.55 MHz, 06-3f-00
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 990MHz
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.87 MHz, 06-3f-00
> cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.89 MHz, 06-3f-00
> cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 0, core 3, package 0
> cpu4 at mainbus0: apid 1 (application processor)
> cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu4: 256KB 64b/line 8-way L2 cache
> cpu4: smt 1, core 0, package 0
> cpu5 at mainbus0: apid 3 (application processor)
> cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu5: 256KB 64b/line 8-way L2 cache
> cpu5: smt 1, core 1, package 0
> cpu6 at mainbus0: apid 5 (application processor)
> cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.94 MHz, 06-3f-00
> cpu6: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu6: 256KB 64b/line 8-way L2 cache
> cpu6: smt 1, core 2, package 0
> cpu7 at mainbus0: apid 7 (application processor)
> cpu7: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.91 MHz, 06-3f-00
> cpu7:
Re: vmd losing VMs
All my vmds are stuck after syzkaller was running for some 5 days (and found a couple of bugs!). I'll reinstall the system to get a fresh baseline.

> 1. Are you getting any vmd cores?
>    * sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd
>    * kill all vmd and restart the test. Not sure if that sysctl can be done
>      after boot or if it needs to be in /etc/sysctl.conf

After applying the fixes for the two bugs that y'all promptly fixed, I'm no longer getting any core dumps.

> 2. Building a host kernel with VMM_DEBUG will be helpful here to see why
>    VMs are disappearing.

Done. Still running the kernel from a few days back.

> 3. You're running this nested in GCP? Do you know what VMX features they
>    expose to guests? Perhaps there is an assumption being made in vmm that
>    we have a certain feature not being exposed by the underlying GCP
>    hypervisor (although I'm pretty sure that's not the case, might be good
>    to check - VMM_DEBUG will tell us this).

ci-openbsd$ ps ax | grep vmd
39771 ??  Ssp     0:06.67 vmd: vmm (vmd)
 7674 ??  Is      0:00.25 vmd: priv (vmd)
44564 ??  Ssp     0:22.78 vmd: control (vmd)
38905 ??  Ssp     0:13.43 /usr/sbin/vmd
88755 ??  Rp/3 4610:25.87 vmd: ci-openbsd-main-0 (vmd)
 9636 ??  Rp/1 4360:16.00 vmd: ci-openbsd-main-2 (vmd)
59918 ??  Rp/1 3559:18.43 vmd: ci-openbsd-main-1 (vmd)

The VMs are unpingable:

ci-openbsd$ netstat -rn
Routing tables

Internet:
Destination      Gateway            Flags  Refs  Use  Mtu  Prio  Iface
default          10.128.0.1         UGS 812002 - 8 vio0
224/4            127.0.0.1          URS 00 32768 8 lo0
10.128.0.1       42:01:0a:80:00:01  UHLch 1 8300 - 7 vio0
10.128.0.1/32    10.128.0.63        UCS 10 - 8 vio0
10.128.0.63      42:01:0a:80:00:3f  UHLl 0 9050 - 1 vio0
10.128.0.63/32   10.128.0.63        UCn 00 - 4 vio0
100.65.69.2/31   100.65.69.2        UCn 03 - 4 tap2
100.65.69.2      fe:e1:ba:d7:6a:76  UHLl 00 - 1 tap2
100.65.104.2/31  100.65.104.2       UCn 03 - 4 tap0
100.65.104.2     fe:e1:ba:d8:6a:68  UHLl 01 - 1 tap0
100.65.212.2/31  100.65.212.2       UCn 03 - 4 tap1
100.65.212.2     fe:e1:ba:da:00:a0  UHLl 01 - 1 tap1
127/8            127.0.0.1          UGRS 00 32768 8 lo0
127.0.0.1        127.0.0.1          UHhl 1 47 32768 1 lo0

ci-openbsd$ ping 100.65.69.3
PING 100.65.69.3 (100.65.69.3): 56 data bytes
^C
--- 100.65.69.3 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$ ping 100.65.104.3
PING 100.65.104.3 (100.65.104.3): 56 data bytes
^C
--- 100.65.104.3 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$ ping 100.65.212.3
PING 100.65.212.3 (100.65.212.3): 56 data bytes
^C
--- 100.65.212.3 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
ci-openbsd$

I tried to attach to them, but I don't think the results are very satisfactory:

(gdb) attach 59918
Attaching to program: /usr/sbin/vmd, process 59918
ptrace: No such process.
(gdb) attach 88755
Attaching to program: /usr/sbin/vmd, process 88755
Couldn't get registers: Device busy.
Couldn't get registers: Device busy.
(gdb) Reading symbols from /usr/lib/libutil.so.13.0...done.
Reading symbols from /usr/lib/libevent.so.4.1...done.
Reading symbols from /usr/lib/libc.so.92.5...done.
Reading symbols from /usr/libexec/ld.so...done.
[New thread 376076]
[New thread 433606]
[Switching to thread 152280]
0x197245b834f5 in event_queue_insert (base=0x19719ae76400, ev=, queue=8) at /usr/src/lib/libevent/event.c:954
954     /usr/src/lib/libevent/event.c: No such file or directory.
(gdb) where
#0  0x197245b834f5 in event_queue_insert (base=0x19719ae76400, ev=, queue=8) at /usr/src/lib/libevent/event.c:954
#1  event_active (ev=, res=1, ncalls=1) at /usr/src/lib/libevent/event.c:806
#2  timeout_process (base=) at /usr/src/lib/libevent/event.c:900
#3  event_base_loop (base=0x19719ae76400, flags=0) at /usr/src/lib/libevent/event.c:499
#4  0x196f8940eeaf in ?? ()
#5  0x197277b6cdce in _rthread_start (v=0x19719ae76400) at /usr/src/lib/librthread/rthread.c:96
#6  0x19726e03db0b in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
#7  0x in ?? ()
(gdb) info threads
  Id  Target Id      Frame
* 1   thread 152280  0x197245b834f5 in event_queue_insert (base=0x19719ae76400, ev=, queue=8) at /usr/src/lib/libevent/event.c:954
  2   thread 376076  futex () at -:3
  3   thread 433606  futex () at -:3
(gdb) thread 2
[Switching to thread 2 (thread 376076)]
#0  futex () at -:3
3       -: No such file or directory.
(gdb) bt
#0  futex () at -:3
#1  0x19726e08fed5 in _rthread_cond_timedwait
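For reference, the core-dump setup quoted at the top of this message could be scripted roughly as follows. This is a hedged sketch (run as root); per the thread, the setting may also need to live in /etc/sysctl.conf to survive a reboot:

```shell
# Hedged sketch of the suggested core-dump setup from the thread.
# Route core dumps of vmd into a dedicated directory.
sysctl kern.nosuidcoredump=3      # value 3 per the thread's suggestion
mkdir -p /var/crash/vmd           # dumps land under /var/crash/vmd
# To persist across reboots, add to /etc/sysctl.conf:
#   kern.nosuidcoredump=3
```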
Re: vmd losing VMs
On Tue, Oct 02, 2018 at 11:02:57AM -0700, Greg Steuck wrote:
> Dmitry, is there an easy way to get at the VMs output?
>
> Here's dmesg from a VMM_DEBUG-enabled kernel (built at "Add support for
> RT3290 chipset by James Hastings."). No lockup thus far.

Yeah, the output below shows normal behaviour. Probably a similar issue to the one reported a while back, but in that case vmd just went away (not spinning).

-ml

> OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct 2 10:37:13 PDT 2018
>     syzkaller@ci-openbsd.syzkaller:/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG
> real mem = 17163079680 (16367MB)
> avail mem = 16633610240 (15863MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries)
> bios0: vendor Google version "Google" date 01/01/2011
> bios0: Google Google Compute Engine
> acpi0 at bios0: rev 0
> acpi0: sleep states S3 S4 S5
> acpi0: tables DSDT FACP SSDT APIC WAET SRAT
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2300.55 MHz, 06-3f-00
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 990MHz
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.87 MHz, 06-3f-00
> cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.89 MHz, 06-3f-00
> cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 0, core 3, package 0
> cpu4 at mainbus0: apid 1 (application processor)
> cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu4: 256KB 64b/line 8-way L2 cache
> cpu4: smt 1, core 0, package 0
> cpu5 at mainbus0: apid 3 (application processor)
> cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00
> cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu5: 256KB 64b/line 8-way L2 cache
> cpu5: smt 1, core 1, package 0
> cpu6 at mainbus0: apid 5 (application processor)
> cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.94 MHz, 06-3f-00
> cpu6: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
> cpu6: 256KB 64b/line 8-way L2 cache
> cpu6: smt 1, core 2, package 0
> cpu7 at mainbus0: apid 7 (application processor)
> cpu7: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.91 MHz, 06-3f-00
> cpu7:
Re: vmd losing VMs
Dmitry, is there an easy way get at the VMs output? Here's dmesg from VMM_DEBUG-enabled kernel (built at "Add support for RT3290 chipset by James Hastings."). No lockup thus far. OpenBSD 6.4 (VMM_DEBUG) #0: Tue Oct 2 10:37:13 PDT 2018 syzkaller@ci-openbsd.syzkaller :/syzkaller/src/sys/arch/amd64/compile/VMM_DEBUG real mem = 17163079680 (16367MB) avail mem = 16633610240 (15863MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcf0 (20 entries) bios0: vendor Google version "Google" date 01/01/2011 bios0: Google Google Compute Engine acpi0 at bios0: rev 0 acpi0: sleep states S3 S4 S5 acpi0: tables DSDT FACP SSDT APIC WAET SRAT acpi0: wakeup devices acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2300.55 MHz, 06-3f-00 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 990MHz cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 0, core 1, package 0 cpu2 at mainbus0: apid 4 (application processor) cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.87 MHz, 06-3f-00 cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 2, package 0 cpu3 at mainbus0: apid 6 (application processor) cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.89 MHz, 06-3f-00 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 0, core 3, package 0 cpu4 at mainbus0: apid 1 (application processor) cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00 cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu4: 256KB 64b/line 8-way L2 cache cpu4: smt 1, core 0, package 0 cpu5 at mainbus0: apid 3 (application processor) cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.88 MHz, 06-3f-00 cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu5: 256KB 64b/line 8-way L2 cache cpu5: smt 1, core 1, package 0 cpu6 at mainbus0: apid 5 (application processor) cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.94 MHz, 06-3f-00 cpu6: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu6: 256KB 64b/line 8-way L2 cache cpu6: smt 1, core 2, package 0 cpu7 at mainbus0: apid 7 (application processor) cpu7: Intel(R) Xeon(R) CPU @ 2.30GHz, 2276.91 MHz, 06-3f-00 cpu7: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN cpu7: 256KB 64b/line 8-way L2 cache cpu7: smt 1, core 3, package 0 ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins acpiprt0 at acpi0: bus 0 (PCI0) acpicpu0 at acpi0: C1(@1 halt!) acpicpu1 at acpi0: C1(@1 halt!) acpicpu2 at acpi0: C1(@1 halt!) acpicpu3 at acpi0: C1(@1 halt!)
Re: vmd losing VMs
Ok thanks. I wonder if there is a tool that can log everything from the VM's console ... It would be useful to see what happens to the VM before it goes into zombie mode. Reyk On Tue, Oct 02, 2018 at 10:33:11AM -0700, Greg Steuck wrote: > I believe I was unable to ssh into them or get cu to elicit any characters. > I'll verify next time it happens. > > On Tue, Oct 2, 2018 at 10:20 AM Reyk Floeter wrote: > > > On Tue, Oct 02, 2018 at 10:10:41AM -0700, Greg Steuck wrote: > > > Naturally, bugs don't solve themselves :) Here's a log, it's not very > > > useful due to the lack of debugging symbols. Notice, that runaway vmds > > > don't die on their own, they just spin out of control. I'll do VMM_DEBUG > > > next. > > > > > > > "they just spin out of control" - maybe I've missed the previous > > details, but do you know what happens to these VMs? Are they stuck in > > ddb or in a reboot loop? > > > > Reyk > > > > > ci-openbsd# sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd > > > kern.nosuidcoredump: 1 -> 3 > > > ci-openbsd# sysctl kern.nosuidcoredump > > > kern.nosuidcoredump=3 > > > ci-openbsd# ps ax | grep vmd > > > 32653 ?? Isp 0:01.28 /usr/sbin/vmd > > > 89585 ?? Is 0:00.14 vmd: priv (vmd) > > > 50191 ?? Isp 0:01.88 vmd: control (vmd) > > > 33160 ?? Isp 0:08.55 vmd: vmm (vmd) > > > 52853 ?? Rp/1 280:12.56 vmd: ci-openbsd-main-1 (vmd) > > > 3238 ?? Rp/2 48:56.13 vmd: ci-openbsd-main-0 (vmd) > > > 44625 ?? Rp/02:54.05 vmd: ci-openbsd-main-2 (vmd) > > > 42187 p5 R+p/0 0:00.00 grep vmd (ksh) > > > ci-openbsd# vmctl status > > >ID PID VCPUS MAXMEM CURMEM TTYOWNER NAME > > > 239 44625 1512M181M ttyp2syzkaller ci-openbsd-main-2 > > > 1 - 1512M - -syzkaller syzkaller > > > ci-openbsd# kill 52853 > > > ci-openbsd# ps ax | grep vmd > > > 32653 ?? Ssp 0:01.28 /usr/sbin/vmd > > > 89585 ?? Is 0:00.14 vmd: priv (vmd) > > > 50191 ?? Ssp 0:01.88 vmd: control (vmd) > > > 33160 ?? Ssp 0:08.55 vmd: vmm (vmd) > > > 52853 ?? Rp/1 280:30.24 vmd: ci-openbsd-main-1 (vmd) > > > 3238 ?? 
Rp/2 49:14.27 vmd: ci-openbsd-main-0 (vmd) > > > 44625 ?? Rp/03:12.25 vmd: ci-openbsd-main-2 (vmd) > > > 27783 p5 R+p/1 0:00.00 grep vmd (ksh) > > > ci-openbsd# ps ax | grep syz > > > 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b > > > /syzkaller/ramdisk > > > 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) > > > 957 ?? S 0:01.09 sshd: syzkaller@ttyp3 (sshd) > > > 28743 ?? I 18:00.14 syzkaller/current/bin/syz-manager -config > > > /syzkaller/managers/main/current/manager.cfg > > > 5869 ?? Ip 0:00.07 ssh -p 22 -i > > /syzkaller/managers/main/current/key > > > -F /dev/null -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o Identi > > > 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) > > > 77889 ?? S 0:00.17 sshd: syzkaller@ttyp4 (sshd) > > > 39218 p5 R+/00:00.00 grep syz > > > 50644 00- D 4:09.87 ./syz-ci -config ./config-openbsd.ci > > > 60603 00- Ip 0:00.05 tee syz-ci.log > > > ci-openbsd# kill 50644 28743 > > > ci-openbsd# ps ax | grep syz > > > 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b > > > /syzkaller/ramdisk > > > 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) > > > 957 ?? I 0:01.09 sshd: syzkaller@ttyp3 (sshd) > > > 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) > > > 77889 ?? S 0:00.18 sshd: syzkaller@ttyp4 (sshd) > > > 15816 p5 R+/10:00.00 grep syz > > > ci-openbsd# rcctl stop vmd > > > vmd(ok) > > > ci-openbsd# ps ax | grep vmd > > > 52853 ?? Rp/1 281:59.20 vmd: ci-openbsd-main-1 (vmd) > > > 3238 ?? Rp/2 50:42.87 vmd: ci-openbsd-main-0 (vmd) > > > 19166 p5 R+/00:00.00 grep vmd > > > ci-openbsd# kill -ABRT 52853 > > > ci-openbsd# ps ax | grep vmd > > > 3238 ?? 
Rp/2 52:06.55 vmd: ci-openbsd-main-0 (vmd) > > > 55423 p5 S+p 0:00.01 grep vmd > > > ci-openbsd# dmesg | tail > > > pms0 at pckbc0 (aux slot) > > > wsmouse0 at pms0 mux 0 > > > pcppi0 at isa0 port 0x61 > > > spkr0 at pcppi0 > > > vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation) > > > vscsi0 at root > > > scsibus2 at vscsi0: 256 targets > > > softraid0 at root > > > scsibus3 at softraid0: 256 targets > > > root on sd0a (dd61083aafe9fd0b.a) swap on sd0b dump on sd0b > > > ci-openbsd# ps ax | grep vmd > > > 3238 ?? Rp/2 52:06.88 vmd: ci-openbsd-main-0 (vmd) > > > 19175 p5 R+/00:00.00 grep vmd > > > ci-openbsd# kill 3238 > > > ci-openbsd# ps ax | grep vmd > > > 3238 ?? Rp/2 52:18.52 vmd: ci-openbsd-main-0 (vmd) > > > 27516 p5 R+/30:00.00 grep vmd > > > ci-openbsd# kill -ABRT 3238 > > > ci-openbsd# ps ax | grep vmd > > > 3238 ?? Rp/2 52:27.71 (vmd) > > > 93083 p5 R+/30:00.00 grep vmd > > > ci-openbsd# ps ax | grep vmd > > > 95984 p5 S+p 0:00.01 grep vmd > > > ci-openbsd# ls -l /var/crash/vmd > > > total 668864 > > > -rw--- 1 root wheel 200320568
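[Editorial note on Reyk's question about logging the VM console: nothing in the thread provides such a tool, but a rough sketch could attach cu(1) to each guest's tty (the TTY column shown by vmctl status) and timestamp every line. The ts() filter, the tty-column position, and the log paths below are all assumptions, not anything vmd ships.]

```shell
# Hedged sketch: tail each VM's console into a timestamped log file.
# ts() prefixes every input line with an ISO-8601-style timestamp.
ts() {
  awk '{ cmd = "date +%Y-%m-%dT%H:%M:%S"
         cmd | getline d
         close(cmd)
         print d, $0
         fflush() }'
}
# Assumed usage (tty column position taken from the vmctl status
# output pasted in this thread; cu(1) attachment is an assumption):
# for vm in ci-openbsd-main-0 ci-openbsd-main-1 ci-openbsd-main-2; do
#   tty=$(vmctl status "$vm" | awk 'NR == 2 { print $6 }')
#   cu -l "/dev/$tty" < /dev/null | ts >> "/var/log/vmd-console-$vm.log" &
# done
```

Capturing the console this way would show whether a VM that later goes into "zombie mode" ever printed a panic or ddb prompt before it stopped responding.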
Re: vmd losing VMs
I believe I was unable to ssh into them or get cu to elicit any characters. I'll verify next time it happens. On Tue, Oct 2, 2018 at 10:20 AM Reyk Floeter wrote: > On Tue, Oct 02, 2018 at 10:10:41AM -0700, Greg Steuck wrote: > > Naturally, bugs don't solve themselves :) Here's a log, it's not very > > useful due to the lack of debugging symbols. Notice, that runaway vmds > > don't die on their own, they just spin out of control. I'll do VMM_DEBUG > > next. > > > > "they just spin out of control" - maybe I've missed the previous > details, but do you know what happens to these VMs? Are they stuck in > ddb or in a reboot loop? > > Reyk > > > ci-openbsd# sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd > > kern.nosuidcoredump: 1 -> 3 > > ci-openbsd# sysctl kern.nosuidcoredump > > kern.nosuidcoredump=3 > > ci-openbsd# ps ax | grep vmd > > 32653 ?? Isp 0:01.28 /usr/sbin/vmd > > 89585 ?? Is 0:00.14 vmd: priv (vmd) > > 50191 ?? Isp 0:01.88 vmd: control (vmd) > > 33160 ?? Isp 0:08.55 vmd: vmm (vmd) > > 52853 ?? Rp/1 280:12.56 vmd: ci-openbsd-main-1 (vmd) > > 3238 ?? Rp/2 48:56.13 vmd: ci-openbsd-main-0 (vmd) > > 44625 ?? Rp/02:54.05 vmd: ci-openbsd-main-2 (vmd) > > 42187 p5 R+p/0 0:00.00 grep vmd (ksh) > > ci-openbsd# vmctl status > >ID PID VCPUS MAXMEM CURMEM TTYOWNER NAME > > 239 44625 1512M181M ttyp2syzkaller ci-openbsd-main-2 > > 1 - 1512M - -syzkaller syzkaller > > ci-openbsd# kill 52853 > > ci-openbsd# ps ax | grep vmd > > 32653 ?? Ssp 0:01.28 /usr/sbin/vmd > > 89585 ?? Is 0:00.14 vmd: priv (vmd) > > 50191 ?? Ssp 0:01.88 vmd: control (vmd) > > 33160 ?? Ssp 0:08.55 vmd: vmm (vmd) > > 52853 ?? Rp/1 280:30.24 vmd: ci-openbsd-main-1 (vmd) > > 3238 ?? Rp/2 49:14.27 vmd: ci-openbsd-main-0 (vmd) > > 44625 ?? Rp/03:12.25 vmd: ci-openbsd-main-2 (vmd) > > 27783 p5 R+p/1 0:00.00 grep vmd (ksh) > > ci-openbsd# ps ax | grep syz > > 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b > > /syzkaller/ramdisk > > 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) > > 957 ?? 
S 0:01.09 sshd: syzkaller@ttyp3 (sshd) > > 28743 ?? I 18:00.14 syzkaller/current/bin/syz-manager -config > > /syzkaller/managers/main/current/manager.cfg > > 5869 ?? Ip 0:00.07 ssh -p 22 -i > /syzkaller/managers/main/current/key > > -F /dev/null -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o Identi > > 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) > > 77889 ?? S 0:00.17 sshd: syzkaller@ttyp4 (sshd) > > 39218 p5 R+/00:00.00 grep syz > > 50644 00- D 4:09.87 ./syz-ci -config ./config-openbsd.ci > > 60603 00- Ip 0:00.05 tee syz-ci.log > > ci-openbsd# kill 50644 28743 > > ci-openbsd# ps ax | grep syz > > 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b > > /syzkaller/ramdisk > > 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) > > 957 ?? I 0:01.09 sshd: syzkaller@ttyp3 (sshd) > > 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) > > 77889 ?? S 0:00.18 sshd: syzkaller@ttyp4 (sshd) > > 15816 p5 R+/10:00.00 grep syz > > ci-openbsd# rcctl stop vmd > > vmd(ok) > > ci-openbsd# ps ax | grep vmd > > 52853 ?? Rp/1 281:59.20 vmd: ci-openbsd-main-1 (vmd) > > 3238 ?? Rp/2 50:42.87 vmd: ci-openbsd-main-0 (vmd) > > 19166 p5 R+/00:00.00 grep vmd > > ci-openbsd# kill -ABRT 52853 > > ci-openbsd# ps ax | grep vmd > > 3238 ?? Rp/2 52:06.55 vmd: ci-openbsd-main-0 (vmd) > > 55423 p5 S+p 0:00.01 grep vmd > > ci-openbsd# dmesg | tail > > pms0 at pckbc0 (aux slot) > > wsmouse0 at pms0 mux 0 > > pcppi0 at isa0 port 0x61 > > spkr0 at pcppi0 > > vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation) > > vscsi0 at root > > scsibus2 at vscsi0: 256 targets > > softraid0 at root > > scsibus3 at softraid0: 256 targets > > root on sd0a (dd61083aafe9fd0b.a) swap on sd0b dump on sd0b > > ci-openbsd# ps ax | grep vmd > > 3238 ?? Rp/2 52:06.88 vmd: ci-openbsd-main-0 (vmd) > > 19175 p5 R+/00:00.00 grep vmd > > ci-openbsd# kill 3238 > > ci-openbsd# ps ax | grep vmd > > 3238 ?? 
Rp/2 52:18.52 vmd: ci-openbsd-main-0 (vmd) > > 27516 p5 R+/30:00.00 grep vmd > > ci-openbsd# kill -ABRT 3238 > > ci-openbsd# ps ax | grep vmd > > 3238 ?? Rp/2 52:27.71 (vmd) > > 93083 p5 R+/30:00.00 grep vmd > > ci-openbsd# ps ax | grep vmd > > 95984 p5 S+p 0:00.01 grep vmd > > ci-openbsd# ls -l /var/crash/vmd > > total 668864 > > -rw--- 1 root wheel 200320568 Oct 2 09:47 3238.core > > -rw--- 1 root wheel 141988032 Oct 2 09:46 52853.core > > ci-openbsd# gdb /usr/sb > > ci-openbsd# file /var/crash/vmd/52853.core > > /var/crash/vmd/52853.core: ELF 64-bit LSB core file x86-64, version 1 > > ci-openbsd# gdb /usr/sbin/vmd /var/crash/vmd/52853.core > > GNU gdb 6.3 > > Copyright 2004 Free Software Foundation, Inc. > > GDB is free software, covered by the GNU General Public License, and you > are > >
Re: vmd losing VMs
On Tue, Oct 02, 2018 at 10:10:41AM -0700, Greg Steuck wrote: > Naturally, bugs don't solve themselves :) Here's a log, it's not very > useful due to the lack of debugging symbols. Notice, that runaway vmds > don't die on their own, they just spin out of control. I'll do VMM_DEBUG > next. > "they just spin out of control" - maybe I've missed the previous details, but do you know what happens to these VMs? Are they stuck in ddb or in a reboot loop? Reyk > ci-openbsd# sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd > kern.nosuidcoredump: 1 -> 3 > ci-openbsd# sysctl kern.nosuidcoredump > kern.nosuidcoredump=3 > ci-openbsd# ps ax | grep vmd > 32653 ?? Isp 0:01.28 /usr/sbin/vmd > 89585 ?? Is 0:00.14 vmd: priv (vmd) > 50191 ?? Isp 0:01.88 vmd: control (vmd) > 33160 ?? Isp 0:08.55 vmd: vmm (vmd) > 52853 ?? Rp/1 280:12.56 vmd: ci-openbsd-main-1 (vmd) > 3238 ?? Rp/2 48:56.13 vmd: ci-openbsd-main-0 (vmd) > 44625 ?? Rp/02:54.05 vmd: ci-openbsd-main-2 (vmd) > 42187 p5 R+p/0 0:00.00 grep vmd (ksh) > ci-openbsd# vmctl status >ID PID VCPUS MAXMEM CURMEM TTYOWNER NAME > 239 44625 1512M181M ttyp2syzkaller ci-openbsd-main-2 > 1 - 1512M - -syzkaller syzkaller > ci-openbsd# kill 52853 > ci-openbsd# ps ax | grep vmd > 32653 ?? Ssp 0:01.28 /usr/sbin/vmd > 89585 ?? Is 0:00.14 vmd: priv (vmd) > 50191 ?? Ssp 0:01.88 vmd: control (vmd) > 33160 ?? Ssp 0:08.55 vmd: vmm (vmd) > 52853 ?? Rp/1 280:30.24 vmd: ci-openbsd-main-1 (vmd) > 3238 ?? Rp/2 49:14.27 vmd: ci-openbsd-main-0 (vmd) > 44625 ?? Rp/03:12.25 vmd: ci-openbsd-main-2 (vmd) > 27783 p5 R+p/1 0:00.00 grep vmd (ksh) > ci-openbsd# ps ax | grep syz > 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b > /syzkaller/ramdisk > 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) > 957 ?? S 0:01.09 sshd: syzkaller@ttyp3 (sshd) > 28743 ?? I 18:00.14 syzkaller/current/bin/syz-manager -config > /syzkaller/managers/main/current/manager.cfg > 5869 ?? 
Ip 0:00.07 ssh -p 22 -i /syzkaller/managers/main/current/key > -F /dev/null -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o Identi > 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) > 77889 ?? S 0:00.17 sshd: syzkaller@ttyp4 (sshd) > 39218 p5 R+/00:00.00 grep syz > 50644 00- D 4:09.87 ./syz-ci -config ./config-openbsd.ci > 60603 00- Ip 0:00.05 tee syz-ci.log > ci-openbsd# kill 50644 28743 > ci-openbsd# ps ax | grep syz > 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b > /syzkaller/ramdisk > 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) > 957 ?? I 0:01.09 sshd: syzkaller@ttyp3 (sshd) > 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) > 77889 ?? S 0:00.18 sshd: syzkaller@ttyp4 (sshd) > 15816 p5 R+/10:00.00 grep syz > ci-openbsd# rcctl stop vmd > vmd(ok) > ci-openbsd# ps ax | grep vmd > 52853 ?? Rp/1 281:59.20 vmd: ci-openbsd-main-1 (vmd) > 3238 ?? Rp/2 50:42.87 vmd: ci-openbsd-main-0 (vmd) > 19166 p5 R+/00:00.00 grep vmd > ci-openbsd# kill -ABRT 52853 > ci-openbsd# ps ax | grep vmd > 3238 ?? Rp/2 52:06.55 vmd: ci-openbsd-main-0 (vmd) > 55423 p5 S+p 0:00.01 grep vmd > ci-openbsd# dmesg | tail > pms0 at pckbc0 (aux slot) > wsmouse0 at pms0 mux 0 > pcppi0 at isa0 port 0x61 > spkr0 at pcppi0 > vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation) > vscsi0 at root > scsibus2 at vscsi0: 256 targets > softraid0 at root > scsibus3 at softraid0: 256 targets > root on sd0a (dd61083aafe9fd0b.a) swap on sd0b dump on sd0b > ci-openbsd# ps ax | grep vmd > 3238 ?? Rp/2 52:06.88 vmd: ci-openbsd-main-0 (vmd) > 19175 p5 R+/00:00.00 grep vmd > ci-openbsd# kill 3238 > ci-openbsd# ps ax | grep vmd > 3238 ?? Rp/2 52:18.52 vmd: ci-openbsd-main-0 (vmd) > 27516 p5 R+/30:00.00 grep vmd > ci-openbsd# kill -ABRT 3238 > ci-openbsd# ps ax | grep vmd > 3238 ?? 
Rp/2 52:27.71 (vmd) > 93083 p5 R+/30:00.00 grep vmd > ci-openbsd# ps ax | grep vmd > 95984 p5 S+p 0:00.01 grep vmd > ci-openbsd# ls -l /var/crash/vmd > total 668864 > -rw--- 1 root wheel 200320568 Oct 2 09:47 3238.core > -rw--- 1 root wheel 141988032 Oct 2 09:46 52853.core > ci-openbsd# gdb /usr/sb > ci-openbsd# file /var/crash/vmd/52853.core > /var/crash/vmd/52853.core: ELF 64-bit LSB core file x86-64, version 1 > ci-openbsd# gdb /usr/sbin/vmd /var/crash/vmd/52853.core > GNU gdb 6.3 > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "amd64-unknown-openbsd6.4"...(no debugging > symbols found) > > Core was generated by `vmd'. > Program terminated with signal 6, Aborted. > Reading
vmd losing VMs
Naturally, bugs don't solve themselves :) Here's a log, it's not very useful due to the lack of debugging symbols. Notice, that runaway vmds don't die on their own, they just spin out of control. I'll do VMM_DEBUG next. ci-openbsd# sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd kern.nosuidcoredump: 1 -> 3 ci-openbsd# sysctl kern.nosuidcoredump kern.nosuidcoredump=3 ci-openbsd# ps ax | grep vmd 32653 ?? Isp 0:01.28 /usr/sbin/vmd 89585 ?? Is 0:00.14 vmd: priv (vmd) 50191 ?? Isp 0:01.88 vmd: control (vmd) 33160 ?? Isp 0:08.55 vmd: vmm (vmd) 52853 ?? Rp/1 280:12.56 vmd: ci-openbsd-main-1 (vmd) 3238 ?? Rp/2 48:56.13 vmd: ci-openbsd-main-0 (vmd) 44625 ?? Rp/02:54.05 vmd: ci-openbsd-main-2 (vmd) 42187 p5 R+p/0 0:00.00 grep vmd (ksh) ci-openbsd# vmctl status ID PID VCPUS MAXMEM CURMEM TTYOWNER NAME 239 44625 1512M181M ttyp2syzkaller ci-openbsd-main-2 1 - 1512M - -syzkaller syzkaller ci-openbsd# kill 52853 ci-openbsd# ps ax | grep vmd 32653 ?? Ssp 0:01.28 /usr/sbin/vmd 89585 ?? Is 0:00.14 vmd: priv (vmd) 50191 ?? Ssp 0:01.88 vmd: control (vmd) 33160 ?? Ssp 0:08.55 vmd: vmm (vmd) 52853 ?? Rp/1 280:30.24 vmd: ci-openbsd-main-1 (vmd) 3238 ?? Rp/2 49:14.27 vmd: ci-openbsd-main-0 (vmd) 44625 ?? Rp/03:12.25 vmd: ci-openbsd-main-2 (vmd) 27783 p5 R+p/1 0:00.00 grep vmd (ksh) ci-openbsd# ps ax | grep syz 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b /syzkaller/ramdisk 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) 957 ?? S 0:01.09 sshd: syzkaller@ttyp3 (sshd) 28743 ?? I 18:00.14 syzkaller/current/bin/syz-manager -config /syzkaller/managers/main/current/manager.cfg 5869 ?? Ip 0:00.07 ssh -p 22 -i /syzkaller/managers/main/current/key -F /dev/null -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o Identi 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) 77889 ?? 
S 0:00.17 sshd: syzkaller@ttyp4 (sshd) 39218 p5 R+/00:00.00 grep syz 50644 00- D 4:09.87 ./syz-ci -config ./config-openbsd.ci 60603 00- Ip 0:00.05 tee syz-ci.log ci-openbsd# kill 50644 28743 ci-openbsd# ps ax | grep syz 50771 ?? Is 0:00.01 /sbin/mount_mfs -s 10G /dev/sd0b /syzkaller/ramdisk 70067 ?? Is 0:00.16 sshd: syzkaller [priv] (sshd) 957 ?? I 0:01.09 sshd: syzkaller@ttyp3 (sshd) 57895 ?? Is 0:00.08 sshd: syzkaller [priv] (sshd) 77889 ?? S 0:00.18 sshd: syzkaller@ttyp4 (sshd) 15816 p5 R+/10:00.00 grep syz ci-openbsd# rcctl stop vmd vmd(ok) ci-openbsd# ps ax | grep vmd 52853 ?? Rp/1 281:59.20 vmd: ci-openbsd-main-1 (vmd) 3238 ?? Rp/2 50:42.87 vmd: ci-openbsd-main-0 (vmd) 19166 p5 R+/00:00.00 grep vmd ci-openbsd# kill -ABRT 52853 ci-openbsd# ps ax | grep vmd 3238 ?? Rp/2 52:06.55 vmd: ci-openbsd-main-0 (vmd) 55423 p5 S+p 0:00.01 grep vmd ci-openbsd# dmesg | tail pms0 at pckbc0 (aux slot) wsmouse0 at pms0 mux 0 pcppi0 at isa0 port 0x61 spkr0 at pcppi0 vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation) vscsi0 at root scsibus2 at vscsi0: 256 targets softraid0 at root scsibus3 at softraid0: 256 targets root on sd0a (dd61083aafe9fd0b.a) swap on sd0b dump on sd0b ci-openbsd# ps ax | grep vmd 3238 ?? Rp/2 52:06.88 vmd: ci-openbsd-main-0 (vmd) 19175 p5 R+/00:00.00 grep vmd ci-openbsd# kill 3238 ci-openbsd# ps ax | grep vmd 3238 ?? Rp/2 52:18.52 vmd: ci-openbsd-main-0 (vmd) 27516 p5 R+/30:00.00 grep vmd ci-openbsd# kill -ABRT 3238 ci-openbsd# ps ax | grep vmd 3238 ?? 
Rp/2 52:27.71 (vmd) 93083 p5 R+/30:00.00 grep vmd ci-openbsd# ps ax | grep vmd 95984 p5 S+p 0:00.01 grep vmd ci-openbsd# ls -l /var/crash/vmd total 668864 -rw--- 1 root wheel 200320568 Oct 2 09:47 3238.core -rw--- 1 root wheel 141988032 Oct 2 09:46 52853.core ci-openbsd# gdb /usr/sb ci-openbsd# file /var/crash/vmd/52853.core /var/crash/vmd/52853.core: ELF 64-bit LSB core file x86-64, version 1 ci-openbsd# gdb /usr/sbin/vmd /var/crash/vmd/52853.core GNU gdb 6.3 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-unknown-openbsd6.4"...(no debugging symbols found) Core was generated by `vmd'. Program terminated with signal 6, Aborted. Reading symbols from /usr/lib/libpthread.so.25.1...done. Loaded symbols for /usr/lib/libpthread.so.25.1 Loaded symbols for /usr/sbin/vmd Reading symbols from /usr/lib/libutil.so.13.0...done. Loaded symbols for /usr/lib/libutil.so.13.0 Symbols already loaded for /usr/lib/libpthread.so.25.1 Reading symbols from /usr/lib/libevent.so.4.1...done. Loaded symbols for /usr/lib/libevent.so.4.1 Reading symbols from /usr/lib/libc.so.92.5...done. Loaded symbols for
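[Editorial note: the interactive gdb session above can be scripted. The base gdb 6.3 shown here predates the -ex option, so the commands go in a file passed with -x. A sketch, with the binary and core paths taken from the session above; everything else (the .bt output naming, the loop) is an assumption:]

```shell
# Sketch: dump a backtrace of every thread from each saved vmd core,
# non-interactively, using a gdb command file instead of -ex.
cmds=$(mktemp)
printf 'thread apply all bt\nquit\n' > "$cmds"
# Assumed loop over the cores collected in /var/crash/vmd:
# for core in /var/crash/vmd/*.core; do
#   gdb -batch -x "$cmds" /usr/sbin/vmd "$core" > "${core%.core}.bt" 2>&1
# done
```

Without debugging symbols in vmd the backtraces will still be mostly addresses, as the session above shows; rebuilding vmd with symbols would be needed for them to be useful.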
Re: vmd losing VMs
On Mon, Oct 01, 2018 at 10:16:24PM -0700, Greg Steuck wrote: > Thanks Mike. > > I've upgraded from Sep 27th to Sep 29th snapshot and so far I haven't seen > the problem with: > > OpenBSD 6.4-beta (GENERIC.MP) #336: Sat Sep 29 08:13:15 MDT 2018 > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > > pd@ is saying he's not responsible for the fix, but maybe something by reyk@ > is? > > I will apply the debugging tools should the problem recur. > > BTW, the memory copy-pasto fix is working well. I was prevented from > running 4x2G VMs ;) > > Thanks > Greg > -- > nest.cx is Gmail hosted, use PGP for anything private. Key: > http://goo.gl/6dMsr > Fingerprint: 5E2B 2D0E 1E03 2046 BEC3 4D50 0B15 42BD 8DF5 A1B0 Thanks. Please let me know if you see any other problems. -ml
Re: vmd losing VMs
Thanks Mike. I've upgraded from Sep 27th to Sep 29th snapshot and so far I haven't seen the problem with: OpenBSD 6.4-beta (GENERIC.MP) #336: Sat Sep 29 08:13:15 MDT 2018 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP pd@ is saying he's not responsible for the fix, but maybe something by reyk@ is? I will apply the debugging tools should the problem recur. BTW, the memory copy-pasto fix is working well. I was prevented from running 4x2G VMs ;) Thanks Greg -- nest.cx is Gmail hosted, use PGP for anything private. Key: http://goo.gl/6dMsr Fingerprint: 5E2B 2D0E 1E03 2046 BEC3 4D50 0B15 42BD 8DF5 A1B0
Re: vmd losing VMs
On Fri, Sep 28, 2018 at 05:01:27PM -0700, Greg Steuck wrote: > I've been running syzkaller for about a day now. It launches/kills VMs all > the time. Somewhere along the way vmd seems to have lost track of one of > its VMs. Notice how syzkaller ci-openbsd-main-1 is conspicuously missing > from vmctl status even though there's a process chewing the CPU for it. > Some of the data may not be perfectly aligned because syzkaller is still > working, so don't be alarmed that some ids don't quite line up. The > important part is the runaway PID 49113. > > All of syzkaller machinery is running as syzkaller user, so it shouldn't be > messing with anything as root. In fact, the whole machine setup is > automated: > https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh > What you see there is literally what's running (modulo missing syzkaller > config and everything that syzkaller does as an ordinary user). > > I'll keep this system limping along should any debug ideas arise. > > OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018 > > ci-openbsd$ vmctl status >ID PID VCPUS MAXMEM CURMEM TTYOWNER NAME > 236 40918 12.0G335M ttyp0syzkaller ci-openbsd-main-2 > 235 85341 12.0G398M ttyp2syzkaller ci-openbsd-main-0 > 1 - 1512M - -syzkaller syzkaller > 1. Are you getting any vmd cores? * sysctl kern.nosuidcoredump=3 && mkdir /var/crash/vmd * kill all vmd and restart the test. Not sure if that sysctl can be done after boot or if it needs to be in /etc/sysctl.conf 2. building a host kernel with VMM_DEBUG will be helpful here to see why VMs are disappearing. 3. You're running this nested in GCP? Do you know what VMX features they expose to guests? Perhaps there is an assumption being made in vmm that we have a certain feature not being exposed by the underlying GCP hypervisor (although I'm pretty sure that's not the case, might be good to check - VMM_DEBUG will tell us this). 
-ml > ci-openbsd$ dmesg > OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018 > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > real mem = 32195465216 (30703MB) > avail mem = 31210467328 (29764MB) > mpath0 at root > scsibus0 at mpath0: 256 targets > mainbus0 at root > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcc0 (22 entries) > bios0: vendor Google version "Google" date 01/01/2011 > bios0: Google Google Compute Engine > acpi0 at bios0: rev 0 > acpi0: sleep states S3 S4 S5 > acpi0: tables DSDT FACP SSDT APIC WAET SRAT > acpi0: wakeup devices > acpitimer0 at acpi0: 3579545 Hz, 24 bits > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2070.62 MHz, 06-3f-00 > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN > cpu0: 256KB 64b/line 8-way L2 cache > cpu0: smt 0, core 0, package 0 > mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges > cpu0: apic clock running at 1000MHz > cpu1 at mainbus0: apid 2 (application processor) > cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.89 MHz, 06-3f-00 > cpu1: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN > cpu1: 256KB 64b/line 8-way L2 cache > cpu1: smt 0, core 1, package 0 > cpu2 at mainbus0: apid 4 (application processor) > cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.88 MHz, 06-3f-00 > cpu2: > 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN > cpu2: 256KB 64b/line 8-way L2 cache > cpu2: smt 0, core 2, package 0 > cpu3 at mainbus0: apid 6 (application processor) > cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.88 MHz, 06-3f-00 > cpu3: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN > cpu3: 256KB 64b/line 8-way L2 cache > cpu3: smt 0, core 3, package 0 > cpu4 at mainbus0: apid 1 (application processor) > cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.89 MHz, 06-3f-00 > cpu4: >
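[Editorial note: ml's point 1 above, including the open question of whether the sysctl must go in /etc/sysctl.conf, can be wrapped up as a small idempotent setup script. This is only a sketch; persist_line() is a hypothetical helper, and nothing here is claimed about when the kernel actually honors the setting.]

```shell
# Sketch for step 1: enable set-id core dumps (collected under
# /var/crash/<progname> at this setting) and persist the sysctl.
# persist_line() appends a line to a file only if it is not there yet.
persist_line() {
  line=$1 conf=$2
  grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
}
# Assumed live setup (run as root on the host, before restarting vmd):
# sysctl kern.nosuidcoredump=3
# mkdir -p /var/crash/vmd
# persist_line 'kern.nosuidcoredump=3' /etc/sysctl.conf
```

Persisting the line covers the case ml raises where the sysctl might only take effect from /etc/sysctl.conf at boot.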
Re: vmd losing VMs
Another one bit the dust, ci-openbsd-main-0 this time.

ci-openbsd$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER NAME
  390 35277     1    2.0G    474M   ttyp2   syzkaller ci-openbsd-main-2
    1     -     1    512M       -       -   syzkaller syzkaller
ci-openbsd$ ps axu | grep vmd
_vmd     35277 109.7  0.3 2100040 98340 ??  Rp/2   9:49AM    12:56.84 vmd: ci-openbsd-main-2 (vmd)
_vmd     49113  96.5  0.2 2099984 47868 ??  Rp/3   Fri07AM 1572:15.59 vmd: ci-openbsd-main-1 (vmd)
_vmd     94581  97.4  0.2 2099944 48344 ??  Rp/1   8:31AM    92:02.56 vmd: ci-openbsd-main-0 (vmd)
_vmd     29877   0.0  0.0    1364  1904 ??  Ssp    Thu01PM    0:02.65 vmd: control (vmd)
_vmd     38711   0.0  0.0    1432  2028 ??  Ssp    Thu01PM    0:37.50 vmd: vmm (vmd)
root     76140   0.0  0.0    1156  1684 ??  Is     Thu01PM    0:00.24 vmd: priv (vmd)
root     23839   0.0  0.0    1624  2004 ??  Ssp    Thu01PM    0:01.97 /usr/sbin/vmd
syzkalle 51031   0.0  0.0     284  1216 p3  S+p    10:04AM    0:00.01 grep vmd
ci-openbsd$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            10.128.0.1         UGS        9     4107     -     8 vio0
224/4              127.0.0.1          URS        0        0 32768     8 lo0
10.128.0.1         42:01:0a:80:00:01  UHLch      1     2998     -     7 vio0
10.128.0.1/32      10.128.0.45        UCS        1        0     -     8 vio0
10.128.0.45        42:01:0a:80:00:2d  UHLl       0     2859     -     1 vio0
10.128.0.45/32     10.128.0.45        UCn        0        0     -     4 vio0
100.64.134.2/31    100.64.134.2       UCn        0        3     -     4 tap1
100.64.134.2       fe:e1:ba:d4:93:1e  UHLl       0        0     -     1 tap1
100.65.132.2/31    100.65.132.2       UCn        0        3     -     4 tap0
100.65.132.2       fe:e1:ba:dc:a6:c5  UHLl       0        0     -     1 tap0
100.65.134.2/31    100.65.134.2       UCn        1        3     -     4 tap2
100.65.134.2       fe:e1:ba:de:95:db  UHLl       0        4     -     1 tap2
100.65.134.3       fe:e1:bb:d1:c8:39  UHLc       2       10     -     3 tap2
127/8              127.0.0.1          UGRS       0        0 32768     8 lo0
127.0.0.1          127.0.0.1          UHhl       1     4159 32768     1 lo0
ci-openbsd$ grep vmd /var/log/daemon | tail -50
Sep 29 03:49:32 ci-openbsd vmd[23839]: config_setvm: failed to start vm ci-openbsd-main-2
Sep 29 03:50:10 ci-openbsd vmd[23839]: ci-openbsd-main-2: user 1000 cpu limit reached
Sep 29 03:50:10 ci-openbsd vmd[23839]: config_setvm: failed to start vm ci-openbsd-main-2
Sep 29 03:50:41 ci-openbsd vmd[23839]: ci-openbsd-main-2: user 1000 cpu limit reached
Sep 29 03:50:41 ci-openbsd vmd[23839]: config_setvm: failed to start vm ci-openbsd-main-2
Sep 29 03:50:53 ci-openbsd vmd[72457]: ci-openbsd-main-test-0: vcpu_deassert_pic_irq: can't deassert INTR for vm_id 357, vcpu_id 0
Sep 29 03:50:53 ci-openbsd vmd[72457]: vcpu_run_loop: vm 357 / vcpu 0 run ioctl failed: No such file or directory
Sep 29 03:50:53 ci-openbsd vmd[6507]: ci-openbsd-main-test-1: vcpu_assert_pic_irq: can't assert INTR
Sep 29 03:50:57 ci-openbsd vmd[90226]: ci-openbsd-main-0: vcpu_assert_pic_irq: can't assert INTR
Sep 29 03:51:15 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 366 successfully, tty /dev/ttyp0
Sep 29 03:51:15 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 367 successfully, tty /dev/ttyp2
Sep 29 04:52:51 ci-openbsd vmd[37572]: ci-openbsd-main-2: can't clear INTR: No such file or directory
Sep 29 04:53:19 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 368 successfully, tty /dev/ttyp0
Sep 29 04:53:20 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 369 successfully, tty /dev/ttyp2
Sep 29 05:15:29 ci-openbsd vmd[59496]: ci-openbsd-main-2: can't clear INTR: No such file or directory
Sep 29 05:16:39 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 370 successfully, tty /dev/ttyp0
Sep 29 05:16:40 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 371 successfully, tty /dev/ttyp2
Sep 29 05:18:01 ci-openbsd vmd[11674]: ci-openbsd-main-2: can't set INTR
Sep 29 05:18:01 ci-openbsd vmd[11674]: ci-openbsd-main-2: can't set INTR: No such file or directory
Sep 29 05:18:14 ci-openbsd vmd[36211]: ci-openbsd-main-0: vcpu_assert_pic_irq: can't assert INTR
Sep 29 05:18:16 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 372 successfully, tty /dev/ttyp0
Sep 29 05:18:23 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 373 successfully, tty /dev/ttyp2
Sep 29 05:22:28 ci-openbsd vmd[43464]: ci-openbsd-main-2: vcpu_deassert_pic_irq: can't deassert INTR for vm_id 365, vcpu_id 0
Sep 29 05:22:39 ci-openbsd vmd[23839]: ci-openbsd-main-2: started vm 374 successfully, tty /dev/ttyp0
Sep 29 05:23:31 ci-openbsd vmd[11677]: ci-openbsd-main-0: can't set INTR: No such file or directory
Sep 29 05:23:41 ci-openbsd vmd[23839]: ci-openbsd-main-0: started vm 375 successfully, tty /dev/ttyp2
Sep 29 05:25:06 ci-openbsd vmd[43334]: ci-openbsd-main-2:
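For what it's worth, the mismatch above can be spotted mechanically. Below is a rough sh sketch; `find_orphans` is my own hypothetical helper, not anything shipped with vmd or vmctl. It cross-references the PID column of `vmctl status` against the `vmd: <name> (vmd)` guest processes in `ps axu` and prints the ones vmd no longer accounts for. It reads the two outputs from files so it can be replayed against saved snapshots like the ones in this mail:

```shell
#!/bin/sh
# Hypothetical consistency check (assumption: not part of vmd/vmctl):
# flag vmd guest processes whose PID no longer appears in `vmctl status`.
# On a live host: vmctl status > /tmp/v.txt; ps axu > /tmp/p.txt
find_orphans() {
    vmctl_out=$1   # saved `vmctl status` output
    ps_out=$2      # saved `ps axu` output

    # PIDs vmd still tracks: second column of the numeric rows
    # (the header row and "-" entries for stopped VMs are skipped).
    known=" $(awk '$2 ~ /^[0-9]+$/ { print $2 }' "$vmctl_out" | tr '\n' ' ') "

    # Guest processes are named "vmd: <vm name> (vmd)"; the control/vmm/priv
    # helper processes match the same shape, so exclude them explicitly.
    grep 'vmd: .* (vmd)$' "$ps_out" |
    grep -Ev 'vmd: (control|vmm|priv) \(vmd\)$' |
    while read -r user pid rest; do
        case "$known" in
        *" $pid "*) ;;                       # still accounted for by vmd
        *) name=${rest##* vmd: }             # recover the VM name from argv
           echo "orphan: ${name% (vmd)} (pid $pid)" ;;
        esac
    done
}
```

Run against the snapshots above it would single out PIDs 49113 and 94581; running something like this periodically would at least timestamp when a VM escapes vmd's bookkeeping.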
vmd losing VMs
I've been running syzkaller for about a day now. It launches and kills VMs all the time, and somewhere along the way vmd seems to have lost track of one of its VMs. Notice how ci-openbsd-main-1 is conspicuously missing from vmctl status even though there's a process chewing CPU for it. Some of the data may not line up perfectly because syzkaller is still working, so don't be alarmed that some IDs don't quite match; the important part is the runaway PID 49113. All of the syzkaller machinery runs as the syzkaller user, so it shouldn't be messing with anything as root. In fact, the whole machine setup is automated: https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh What you see there is literally what's running (modulo the syzkaller config and everything syzkaller does as an ordinary user). I'll keep this system limping along should any debug ideas arise.

OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018

ci-openbsd$ vmctl status
   ID   PID VCPUS  MAXMEM  CURMEM     TTY        OWNER NAME
  236 40918     1    2.0G    335M   ttyp0   syzkaller ci-openbsd-main-2
  235 85341     1    2.0G    398M   ttyp2   syzkaller ci-openbsd-main-0
    1     -     1    512M       -       -   syzkaller syzkaller
ci-openbsd$ dmesg
OpenBSD 6.4-beta (GENERIC.MP) #329: Thu Sep 27 10:15:21 MDT 2018
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 32195465216 (30703MB)
avail mem = 31210467328 (29764MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbcc0 (22 entries)
bios0: vendor Google version "Google" date 01/01/2011
bios0: Google Google Compute Engine
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC WAET SRAT
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU @ 2.30GHz, 2070.62 MHz, 06-3f-00
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.89 MHz, 06-3f-00
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.88 MHz, 06-3f-00
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.88 MHz, 06-3f-00
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 1 (application processor)
cpu4: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.89 MHz, 06-3f-00
cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 1, core 0, package 0
cpu5 at mainbus0: apid 3 (application processor)
cpu5: Intel(R) Xeon(R) CPU @ 2.30GHz, 2299.92 MHz, 06-3f-00
cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,XSAVEOPT,MELTDOWN
cpu5: 256KB 64b/line 8-way L2 cache
cpu5: smt 1, core 1, package 0
cpu6 at mainbus0: apid 5 (application processor)
cpu6: Intel(R) Xeon(R) CPU @ 2.30GHz,