On 31/05/17(Wed) 22:43, Gregor Best wrote: > Hi people, > > with a fresh snapshot (according to the dates on the FTP server from > today), and a kernel built from CVS from about 20:15 UTC+1, vmd's child > processes reproducibly enter an infinite loop relatively early after > booting an Alpine Linux VM. The problem is not reproducible when booting > my /bsd.rd.
Thanks for reporting the problem, it's certainly related to the new condvard. I reverted librthread to the old implementation for the moment. > > The only local changes to the kernel are the workaround for PS2 AUX > probing I sent to tech@ a while back and a tweak to make the console use > the full area of the EFI frame buffer. > > The following steps reproduce the problem on my laptop: > > 1: Set up a VM like this, the iso is from > > https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-virt-3.6.0-x86_64.iso > > vm "alpine" { > memory 1024M > disk "/path/to/alpine/iso" > local interface > disable > } > > 2: Start vmd, and start the VM > 3: Open a console on the VM > 4: Hit return a bunch of times to make the boot progress, then log in as > root > 5: Run something simple like `ls' or `ip a s' > 6: After a few tries, the VM locks up and the CPU usage of the child > process for that VM goes to 100% > > Some times, touching the network interface by doing > > ip l set eth0 up > > seems to be necessary to trigger the hang, most of the time the hang > happens without. > > The hang does not happen when I leave the VM idling at the login prompt, > or if the only device the VM is configured with is a single disk (and no > network device). > > When ktrace'ing the child process before the lockup, that doesn't > happen, which is why I've added "Race condition (?)" to the subject. > > The VM can be stopped and (re)started with vmctl without problems, > detaching and reattaching the console works as well. > > Starting a ktrace after the lockup yields the following: > > 38688 vmd STRU struct timespec { 0.007390000 } > 38688 vmd STRU struct kevent { ident=9, filter=EVFILT_READ, > flags=0x1<EV_ADD>, fflags=0<>, data=12, udata=0xba8102a7638 } > 38688 vmd RET kevent 1 > 38688 vmd CALL clock_gettime(CLOCK_MONOTONIC,0xbaa59fedee0) > 38688 vmd STRU struct timespec { 7237.431970804 } > 38688 vmd RET clock_gettime 0 > 38688 vmd CALL kevent(0,0xbaa36b96000,0,0xbaa36b8e000,64,0xbaa59fedeb0) > 38688 vmd STRU struct timespec { 0.007341000 } > 38688 vmd STRU struct kevent { ident=9, filter=EVFILT_READ, > flags=0x1<EV_ADD>, fflags=0<>, data=12, udata=0xba8102a7638 } > 38688 vmd RET kevent 1 > 38688 vmd CALL clock_gettime(CLOCK_MONOTONIC,0xbaa59fedee0) > 38688 vmd STRU struct timespec { 7237.432007661 } > 38688 vmd RET clock_gettime 0 > [ ... remainder omitted ... ] > > The ktrace.out file grows by about 100 MB/s and this sequence is the > only content. > > GDB yields the following trace on the child process when it's in the > locked up state: > > (gdb) info threads > 3 thread 229483 0x00001f302beba75a in futex () at {standard input}:5 > * 2 thread 138274 0x00001f302be57ada in kevent () at {standard input}:5 > 1 thread 284521 0x00001f302beba75a in futex () at {standard input}:5 > (gdb) thread 1 > [Switching to thread 1 (thread 284521)]#0 0x00001f302beba75a in futex () at > {standard input}:5 > 5 {standard input}: No such file or directory. > in {standard input} > (gdb) bt > #0 0x00001f302beba75a in futex () at {standard input}:5 > #1 0x00001f2f9f30f8fb in _rthread_cond_timedwait (cond=0x1f2ff23fb1e0, > mutexp=0x1f2d469adea0, abs=0x0) at synch.h:43 > #2 0x00001f2d4670bff1 in run_vm (child_disks=0x1f2fcf57d51c, > child_taps=0x7f7fffff97f0, vmc=0x1f2fcf57d000, vrs=0x7f7fffff9610) at vm.c:825 > #3 0x00001f2d4670b18b in start_vm (vm=0x1f2fcf57d000, fd=12) at vm.c:334 > #4 0x00001f2d4670aacc in vmm_start_vm (imsg=0x7f7fffff9970, > id=0x7f7fffff9914) at vmm.c:545 > #5 0x00001f2d46709def in vmm_dispatch_parent (fd=3, p=0x1f2d4692d1a0, > imsg=0x7f7fffff9970) at vmm.c:135 > #6 0x00001f2d46707fee in proc_dispatch (fd=3, event=2, arg=0x1f2fdef2c000) > at proc.c:652 > #7 0x00001f301c04c2b8 in event_base_loop (base=0x1f2f7f196800, > flags=Variable "flags" is not available. > ) at /usr/src/lib/libevent/event.c:350 > #8 0x00001f2d46707e52 in proc_run (ps=0x1f2ff23fa000, p=0x1f2d4692d0a0, > procs=0x1f2d4692d1a0, nproc=1, run=0x1f2d46709b8b <vmm_run>, arg=0x0) at > proc.c:594 > #9 0x00001f2d46709b89 in vmm (ps=0x1f2ff23fa000, p=0x1f2d4692d0a0) at > vmm.c:74 > #10 0x00001f2d46706ed9 in proc_init (ps=0x1f2ff23fa000, procs=0x1f2d4692d020, > nproc=3, argc=6, argv=0x7f7fffff9c58, proc_id=PROC_VMM) at proc.c:249 > #11 0x00001f2d4670227d in main (argc=0, argv=0x7f7fffff9c58) at vmd.c:482 > (gdb) thread 2 > [Switching to thread 2 (thread 138274)]#0 0x00001f302be57ada in kevent () at > {standard input}:5 > 5 {standard input}: No such file or directory. > in {standard input} > (gdb) bt > #0 0x00001f302be57ada in kevent () at {standard input}:5 > #1 0x00001f301c046ac0 in kq_dispatch (base=Variable "base" is not available. > ) at /usr/src/lib/libevent/kqueue.c:193 > #2 0x00001f301c04c1e2 in event_base_loop (base=0x1f2ff23ff400, > flags=Variable "flags" is not available. > ) at /usr/src/lib/libevent/event.c:493 > #3 0x00001f2d4670c18a in event_thread (arg=0x7f7fffff90bf) at vm.c:881 > #4 0x00001f2f9f31183e in _rthread_start (v=Variable "v" is not available. > ) at /usr/src/lib/librthread/rthread.c:114 > #5 0x00001f302be698bb in __tfork_thread () at > /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75 > #6 0x0000000000000000 in ?? () > (gdb) thread 3 > [Switching to thread 3 (thread 229483)]#0 0x00001f302beba75a in futex () at > {standard input}:5 > 5 {standard input}: No such file or directory. > in {standard input} > (gdb) bt > #0 0x00001f302beba75a in futex () at {standard input}:5 > #1 0x00001f2f9f30f8fb in _rthread_cond_timedwait (cond=0x1f2f497ed410, > mutexp=0x1f2d469adec0, abs=0x0) at synch.h:43 > #2 0x00001f2d4670c284 in vcpu_run_loop (arg=0x1f2f994b3760) at vm.c:926 > #3 0x00001f2f9f31183e in _rthread_start (v=Variable "v" is not available. > ) at /usr/src/lib/librthread/rthread.c:114 > #4 0x00001f302be698bb in __tfork_thread () at > /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75 > #5 0x0000000000000000 in ?? () > > A dmesg follows after my signature. > > What can I do to debug this further? > > -- > Gregor > > OpenBSD 6.1-current (GENERIC.MP) #77: Wed May 31 20:07:33 CEST 2017 > [email protected]:/usr/obj/sys/arch/amd64/compile/GENERIC.MP > real mem = 17078013952 (16286MB) > avail mem = 16554565632 (15787MB) > mpath0 at root > scsibus0 at mpath0: 256 targets > mainbus0 at root > bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x87ed7000 (44 entries) > bios0: vendor American Megatrends Inc. version "5.11" date 08/29/2016 > bios0: Notebook N24_25JU > acpi0 at bios0: rev 2 > acpi0: sleep states S0 S3 S4 S5 > acpi0: tables DSDT FACP APIC FPDT FIDT MCFG HPET SSDT SSDT DBGP DBG2 SSDT > UEFI SSDT BGRT DMAR TPM2 ASF! > acpi0: wakeup devices PEGP(S4) PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) > PXSX(S4) RP17(S4) PXSX(S4) RP18(S4) PXSX(S4) RP19(S4) PXSX(S4) RP20(S4) > PXSX(S4) RP01(S4) [...] > acpitimer0 at acpi0: 3579545 Hz, 24 bits > acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT > cpu0: 256KB 64b/line 8-way L2 cache > cpu0: TSC frequency 2400000000 Hz > cpu0: smt 0, core 0, package 0 > mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges > cpu0: apic clock running at 23MHz > cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE > cpu1 at mainbus0: apid 2 (application processor) > cpu1: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz > cpu1: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT > cpu1: 256KB 64b/line 8-way L2 cache > cpu1: smt 0, core 1, package 0 > cpu2 at mainbus0: apid 1 (application processor) > cpu2: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz > cpu2: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT > cpu2: 256KB 64b/line 8-way L2 cache > cpu2: smt 1, core 0, package 0 > cpu3 at mainbus0: apid 3 (application processor) > cpu3: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz > cpu3: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT > cpu3: 256KB 64b/line 8-way L2 cache > cpu3: smt 1, core 1, package 0 > ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 120 pins > acpimcfg0 at acpi0 addr 0xe0000000, bus 0-255 > acpihpet0 at acpi0: 23999999 Hz > acpiprt0 at acpi0: bus 0 (PCI0) > acpiprt1 at acpi0: bus -1 (PEG0) > acpiprt2 at acpi0: bus -1 (PEG1) > acpiprt3 at acpi0: bus -1 (PEG2) > acpiprt4 at acpi0: bus -1 (RP17) > acpiprt5 at acpi0: bus -1 (RP18) > acpiprt6 at acpi0: bus -1 (RP19) > acpiprt7 at acpi0: bus -1 (RP20) > acpiprt8 at acpi0: bus 1 (RP01) > acpiprt9 at acpi0: bus 2 (RP02) > acpiprt10 at acpi0: bus -1 (RP03) > acpiprt11 at acpi0: bus -1 (RP04) > acpiprt12 at acpi0: bus -1 (RP05) > acpiprt13 at acpi0: bus -1 (RP06) > acpiprt14 at acpi0: bus -1 (RP07) > acpiprt15 at acpi0: bus -1 (RP08) > acpiprt16 at acpi0: bus -1 (RP09) > acpiprt17 at acpi0: bus -1 (RP10) > acpiprt18 at acpi0: bus -1 (RP11) > acpiprt19 at acpi0: bus -1 (RP12) > acpiprt20 at acpi0: bus -1 (RP13) > acpiprt21 at acpi0: bus -1 (RP14) > acpiprt22 at acpi0: bus -1 (RP15) > acpiprt23 at acpi0: bus -1 (RP16) > acpiec0 at acpi0 > acpicpu0 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), > C1(1000@1 mwait.1), PSS > acpicpu1 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), > C1(1000@1 mwait.1), PSS > acpicpu2 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), > C1(1000@1 mwait.1), PSS > acpicpu3 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), > C1(1000@1 mwait.1), PSS > acpipwrres0 at acpi0: PG00, resource for PEG0 > acpipwrres1 at acpi0: PG01, resource for PEG1 > acpipwrres2 at acpi0: PG02, resource for PEG2 > acpitz0 at acpi0: critical temperature is 120 degC > "INT3F0D" at acpi0 not configured > "MSFT0001" at acpi0 not configured > "SYN1218" at acpi0 not configured > "INT33A1" at acpi0 not configured > "MSFT0101" at acpi0 not configured > "PNPC000" at acpi0 not configured > acpibtn0 at acpi0: PWRB > acpibtn1 at acpi0: SLPB > acpibtn2 at acpi0: LID0 > acpiac0 at acpi0: AC unit online > acpibat0 at acpi0: BAT0 model "BAT" serial 0001 type LION oem "Notebook" > "PNP0C14" at acpi0 not configured > "INT340E" at acpi0 not configured > acpivideo0 at acpi0: GFX0 > acpivout0 at acpivideo0: DD1F > cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2301, 2300, 2200, 2100, 2000, > 1800, 1700, 1500, 1400, 1300, 1100, 1000, 800, 700, 500, 400 MHz > pci0 at mainbus0 bus 0 > pchb0 at pci0 dev 0 function 0 "Intel Core 6G Host" rev 0x08 > "Intel HD Graphics 520" rev 0x07 at pci0 dev 2 function 0 not configured > xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x21: msi > usb0 at xhci0: USB revision 3.0 > uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 > addr 1 > "Intel 100 Series MEI" rev 0x21 at pci0 dev 22 function 0 not configured > ahci0 at pci0 dev 23 function 0 "Intel 100 Series AHCI" rev 0x21: msi, AHCI > 1.3.1 > ahci0: port 0: 6.0Gb/s > scsibus1 at ahci0: 32 targets > sd0 at scsibus1 targ 0 lun 0: <ATA, CT240BX200SSD1, MU02> SCSI3 0/direct > fixed naa.500a0751f01c9d7a > sd0: 228936MB, 512 bytes/sector, 468862128 sectors, thin > ppb0 at pci0 dev 28 function 0 "Intel 100 Series PCIE" rev 0xf1: msi > pci1 at ppb0 bus 1 > rtsx0 at pci1 dev 0 function 0 "Realtek RTL8411B Card Reader" rev 0x01: msi > sdmmc0 at rtsx0: 4-bit > re0 at pci1 dev 0 function 1 "Realtek 8168" rev 0x12: RTL8411B (0x5c80), msi, > address 80:fa:5b:33:76:b3 > rgephy0 at re0 phy 7: RTL8251 PHY, rev. 0 > ppb1 at pci0 dev 28 function 1 "Intel 100 Series PCIE" rev 0xf1: msi > pci2 at ppb1 bus 2 > iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless AC 8260" rev 0x3a, msi > pcib0 at pci0 dev 31 function 0 "Intel 100 Series LPC" rev 0x21 > "Intel 100 Series PMC" rev 0x21 at pci0 dev 31 function 2 not configured > azalia0 at pci0 dev 31 function 3 "Intel 100 Series HD Audio" rev 0x21: msi > azalia0: codecs: Realtek ALC269, Intel/0x2809, using Realtek ALC269 > audio0 at azalia0 > ichiic0 at pci0 dev 31 function 4 "Intel 100 Series SMBus" rev 0x21: apic 2 > int 16 > iic0 at ichiic0 > isa0 at pcib0 > isadma0 at isa0 > pckbc0 at isa0 port 0x60/5 irq 1 irq 12 > pckbd0 at pckbc0 (kbd slot) > wskbd0 at pckbd0: console keyboard > pms0 at pckbc0 (aux slot) > wsmouse0 at pms0 mux 0 > pms0: Synaptics touchpad, firmware 8.2 > pcppi0 at isa0 port 0x61 > spkr0 at pcppi0 > vmm0 at mainbus0: VMX/EPT > efifb0 at mainbus0: 1920x1080, 32bpp > wsdisplay0 at efifb0 mux 1: console (std, vt100 emulation), using wskbd0 > wsdisplay0: screen 1-5 added (std, vt100 emulation) > ugen0 at uhub0 port 5 "Intel Bluetooth" rev 2.00/0.01 addr 2 > uvideo0 at uhub0 port 6 configuration 1 interface 0 "Generic Chicony USB 2.0 > Camera" rev 2.00/32.07 addr 3 > video0 at uvideo0 > vscsi0 at root > scsibus2 at vscsi0: 256 targets > softraid0 at root > scsibus3 at softraid0: 256 targets > sd1 at scsibus3 targ 1 lun 0: <OPENBSD, SR CRYPTO, 006> SCSI2 0/direct fixed > sd1: 228896MB, 512 bytes/sector, 468779647 sectors > root on sd1a (927460f40d363126.a) swap on sd1b dump on sd1b > iwm0: hw rev 0x200, fw ver 16.242414.0, address e4:a4:71:e1:45:79 >
