On 31/05/17(Wed) 22:43, Gregor Best wrote:
> Hi people,
> 
> with a fresh snapshot (according to the dates on the FTP server from
> today), and a kernel built from CVS from about 20:15 UTC+1, vmd's child
> processes reproducibly enter an infinite loop relatively early after
> booting an Alpine Linux VM. The problem is not reproducible when booting
> my /bsd.rd.

Thanks for reporting the problem, it's certainly related to the new
condvard.  I reverted librthread to the old implementation for the
moment.

> 
> The only local changes to the kernel are the workaround for PS2 AUX
> probing I sent to tech@ a while back and a tweak to make the console use
> the full area of the EFI frame buffer.
> 
> The following steps reproduce the problem on my laptop:
> 
> 1: Set up a VM like this, the iso is from
>    
> https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-virt-3.6.0-x86_64.iso
> 
>       vm "alpine" {
>               memory 1024M
>               disk "/path/to/alpine/iso"
>               local interface
>               disable
>       }
> 
> 2: Start vmd, and start the VM
> 3: Open a console on the VM
> 4: Hit return a bunch of times to make the boot progress, then log in as
>    root
> 5: Run something simple like `ls' or `ip a s'
> 6: After a few tries, the VM locks up and the CPU usage of the child
>    process for that VM goes to 100%
> 
> Some times, touching the network interface by doing
> 
>       ip l set eth0 up
> 
> seems to be necessary to trigger the hang, most of the time the hang
> happens without.
> 
> The hang does not happen when I leave the VM idling at the login prompt,
> or if the only device the VM is configured with is a single disk (and no
> network device).
> 
> When ktrace'ing the child process before the lockup, that doesn't
> happen, which is why I've added "Race condition (?)" to the subject.
> 
> The VM can be stopped and (re)started with vmctl without problems,
> detaching and reattaching the console works as well.
> 
> Starting a ktrace after the lockup yields the following:
> 
> 38688 vmd      STRU  struct timespec { 0.007390000 }
> 38688 vmd      STRU  struct kevent { ident=9, filter=EVFILT_READ, 
> flags=0x1<EV_ADD>, fflags=0<>, data=12, udata=0xba8102a7638 }
> 38688 vmd      RET   kevent 1
> 38688 vmd      CALL  clock_gettime(CLOCK_MONOTONIC,0xbaa59fedee0)
> 38688 vmd      STRU  struct timespec { 7237.431970804 }
> 38688 vmd      RET   clock_gettime 0
> 38688 vmd      CALL  kevent(0,0xbaa36b96000,0,0xbaa36b8e000,64,0xbaa59fedeb0)
> 38688 vmd      STRU  struct timespec { 0.007341000 }
> 38688 vmd      STRU  struct kevent { ident=9, filter=EVFILT_READ, 
> flags=0x1<EV_ADD>, fflags=0<>, data=12, udata=0xba8102a7638 }
> 38688 vmd      RET   kevent 1
> 38688 vmd      CALL  clock_gettime(CLOCK_MONOTONIC,0xbaa59fedee0)
> 38688 vmd      STRU  struct timespec { 7237.432007661 }
> 38688 vmd      RET   clock_gettime 0
> [ ... remainder omitted ... ]
> 
> The ktrace.out file grows by about 100 MB/s and this sequence is the
> only content.
> 
> GDB yields the following trace on the child process when it's in the
> locked up state:
> 
> (gdb) info threads
>   3 thread 229483  0x00001f302beba75a in futex () at {standard input}:5
> * 2 thread 138274  0x00001f302be57ada in kevent () at {standard input}:5
>   1 thread 284521  0x00001f302beba75a in futex () at {standard input}:5
> (gdb) thread 1
> [Switching to thread 1 (thread 284521)]#0  0x00001f302beba75a in futex () at 
> {standard input}:5
> 5       {standard input}: No such file or directory.
>         in {standard input}
> (gdb) bt
> #0  0x00001f302beba75a in futex () at {standard input}:5
> #1  0x00001f2f9f30f8fb in _rthread_cond_timedwait (cond=0x1f2ff23fb1e0, 
> mutexp=0x1f2d469adea0, abs=0x0) at synch.h:43
> #2  0x00001f2d4670bff1 in run_vm (child_disks=0x1f2fcf57d51c, 
> child_taps=0x7f7fffff97f0, vmc=0x1f2fcf57d000, vrs=0x7f7fffff9610) at vm.c:825
> #3  0x00001f2d4670b18b in start_vm (vm=0x1f2fcf57d000, fd=12) at vm.c:334
> #4  0x00001f2d4670aacc in vmm_start_vm (imsg=0x7f7fffff9970, 
> id=0x7f7fffff9914) at vmm.c:545
> #5  0x00001f2d46709def in vmm_dispatch_parent (fd=3, p=0x1f2d4692d1a0, 
> imsg=0x7f7fffff9970) at vmm.c:135
> #6  0x00001f2d46707fee in proc_dispatch (fd=3, event=2, arg=0x1f2fdef2c000) 
> at proc.c:652
> #7  0x00001f301c04c2b8 in event_base_loop (base=0x1f2f7f196800, 
> flags=Variable "flags" is not available.
> ) at /usr/src/lib/libevent/event.c:350
> #8  0x00001f2d46707e52 in proc_run (ps=0x1f2ff23fa000, p=0x1f2d4692d0a0, 
> procs=0x1f2d4692d1a0, nproc=1, run=0x1f2d46709b8b <vmm_run>, arg=0x0) at 
> proc.c:594
> #9  0x00001f2d46709b89 in vmm (ps=0x1f2ff23fa000, p=0x1f2d4692d0a0) at 
> vmm.c:74
> #10 0x00001f2d46706ed9 in proc_init (ps=0x1f2ff23fa000, procs=0x1f2d4692d020, 
> nproc=3, argc=6, argv=0x7f7fffff9c58, proc_id=PROC_VMM) at proc.c:249
> #11 0x00001f2d4670227d in main (argc=0, argv=0x7f7fffff9c58) at vmd.c:482
> (gdb) thread 2
> [Switching to thread 2 (thread 138274)]#0  0x00001f302be57ada in kevent () at 
> {standard input}:5
> 5       {standard input}: No such file or directory.
>         in {standard input}
> (gdb) bt
> #0  0x00001f302be57ada in kevent () at {standard input}:5
> #1  0x00001f301c046ac0 in kq_dispatch (base=Variable "base" is not available.
> ) at /usr/src/lib/libevent/kqueue.c:193
> #2  0x00001f301c04c1e2 in event_base_loop (base=0x1f2ff23ff400, 
> flags=Variable "flags" is not available.
> ) at /usr/src/lib/libevent/event.c:493
> #3  0x00001f2d4670c18a in event_thread (arg=0x7f7fffff90bf) at vm.c:881
> #4  0x00001f2f9f31183e in _rthread_start (v=Variable "v" is not available.
> ) at /usr/src/lib/librthread/rthread.c:114
> #5  0x00001f302be698bb in __tfork_thread () at 
> /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
> #6  0x0000000000000000 in ?? ()
> (gdb) thread 3
> [Switching to thread 3 (thread 229483)]#0  0x00001f302beba75a in futex () at 
> {standard input}:5
> 5       {standard input}: No such file or directory.
>         in {standard input}
> (gdb) bt
> #0  0x00001f302beba75a in futex () at {standard input}:5
> #1  0x00001f2f9f30f8fb in _rthread_cond_timedwait (cond=0x1f2f497ed410, 
> mutexp=0x1f2d469adec0, abs=0x0) at synch.h:43
> #2  0x00001f2d4670c284 in vcpu_run_loop (arg=0x1f2f994b3760) at vm.c:926
> #3  0x00001f2f9f31183e in _rthread_start (v=Variable "v" is not available.
> ) at /usr/src/lib/librthread/rthread.c:114
> #4  0x00001f302be698bb in __tfork_thread () at 
> /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
> #5  0x0000000000000000 in ?? ()
> 
> A dmesg follows after my signature.
> 
> What can I do to debug this further?
> 
> -- 
>       Gregor
> 
> OpenBSD 6.1-current (GENERIC.MP) #77: Wed May 31 20:07:33 CEST 2017
>     [email protected]:/usr/obj/sys/arch/amd64/compile/GENERIC.MP
> real mem = 17078013952 (16286MB)
> avail mem = 16554565632 (15787MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x87ed7000 (44 entries)
> bios0: vendor American Megatrends Inc. version "5.11" date 08/29/2016
> bios0: Notebook N24_25JU
> acpi0 at bios0: rev 2
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP APIC FPDT FIDT MCFG HPET SSDT SSDT DBGP DBG2 SSDT 
> UEFI SSDT BGRT DMAR TPM2 ASF!
> acpi0: wakeup devices PEGP(S4) PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) 
> PXSX(S4) RP17(S4) PXSX(S4) RP18(S4) PXSX(S4) RP19(S4) PXSX(S4) RP20(S4) 
> PXSX(S4) RP01(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: TSC frequency 2400000000 Hz
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 23MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 1 (application processor)
> cpu2: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 1, core 0, package 0
> cpu3 at mainbus0: apid 3 (application processor)
> cpu3: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 2400.00 MHz
> cpu3: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 1, core 1, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 120 pins
> acpimcfg0 at acpi0 addr 0xe0000000, bus 0-255
> acpihpet0 at acpi0: 23999999 Hz
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus -1 (PEG0)
> acpiprt2 at acpi0: bus -1 (PEG1)
> acpiprt3 at acpi0: bus -1 (PEG2)
> acpiprt4 at acpi0: bus -1 (RP17)
> acpiprt5 at acpi0: bus -1 (RP18)
> acpiprt6 at acpi0: bus -1 (RP19)
> acpiprt7 at acpi0: bus -1 (RP20)
> acpiprt8 at acpi0: bus 1 (RP01)
> acpiprt9 at acpi0: bus 2 (RP02)
> acpiprt10 at acpi0: bus -1 (RP03)
> acpiprt11 at acpi0: bus -1 (RP04)
> acpiprt12 at acpi0: bus -1 (RP05)
> acpiprt13 at acpi0: bus -1 (RP06)
> acpiprt14 at acpi0: bus -1 (RP07)
> acpiprt15 at acpi0: bus -1 (RP08)
> acpiprt16 at acpi0: bus -1 (RP09)
> acpiprt17 at acpi0: bus -1 (RP10)
> acpiprt18 at acpi0: bus -1 (RP11)
> acpiprt19 at acpi0: bus -1 (RP12)
> acpiprt20 at acpi0: bus -1 (RP13)
> acpiprt21 at acpi0: bus -1 (RP14)
> acpiprt22 at acpi0: bus -1 (RP15)
> acpiprt23 at acpi0: bus -1 (RP16)
> acpiec0 at acpi0
> acpicpu0 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu1 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu2 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu3 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpipwrres0 at acpi0: PG00, resource for PEG0
> acpipwrres1 at acpi0: PG01, resource for PEG1
> acpipwrres2 at acpi0: PG02, resource for PEG2
> acpitz0 at acpi0: critical temperature is 120 degC
> "INT3F0D" at acpi0 not configured
> "MSFT0001" at acpi0 not configured
> "SYN1218" at acpi0 not configured
> "INT33A1" at acpi0 not configured
> "MSFT0101" at acpi0 not configured
> "PNPC000" at acpi0 not configured
> acpibtn0 at acpi0: PWRB
> acpibtn1 at acpi0: SLPB
> acpibtn2 at acpi0: LID0
> acpiac0 at acpi0: AC unit online
> acpibat0 at acpi0: BAT0 model "BAT" serial 0001 type LION oem "Notebook"
> "PNP0C14" at acpi0 not configured
> "INT340E" at acpi0 not configured
> acpivideo0 at acpi0: GFX0
> acpivout0 at acpivideo0: DD1F
> cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2301, 2300, 2200, 2100, 2000, 
> 1800, 1700, 1500, 1400, 1300, 1100, 1000, 800, 700, 500, 400 MHz
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel Core 6G Host" rev 0x08
> "Intel HD Graphics 520" rev 0x07 at pci0 dev 2 function 0 not configured
> xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x21: msi
> usb0 at xhci0: USB revision 3.0
> uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 
> addr 1
> "Intel 100 Series MEI" rev 0x21 at pci0 dev 22 function 0 not configured
> ahci0 at pci0 dev 23 function 0 "Intel 100 Series AHCI" rev 0x21: msi, AHCI 
> 1.3.1
> ahci0: port 0: 6.0Gb/s
> scsibus1 at ahci0: 32 targets
> sd0 at scsibus1 targ 0 lun 0: <ATA, CT240BX200SSD1, MU02> SCSI3 0/direct 
> fixed naa.500a0751f01c9d7a
> sd0: 228936MB, 512 bytes/sector, 468862128 sectors, thin
> ppb0 at pci0 dev 28 function 0 "Intel 100 Series PCIE" rev 0xf1: msi
> pci1 at ppb0 bus 1
> rtsx0 at pci1 dev 0 function 0 "Realtek RTL8411B Card Reader" rev 0x01: msi
> sdmmc0 at rtsx0: 4-bit
> re0 at pci1 dev 0 function 1 "Realtek 8168" rev 0x12: RTL8411B (0x5c80), msi, 
> address 80:fa:5b:33:76:b3
> rgephy0 at re0 phy 7: RTL8251 PHY, rev. 0
> ppb1 at pci0 dev 28 function 1 "Intel 100 Series PCIE" rev 0xf1: msi
> pci2 at ppb1 bus 2
> iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless AC 8260" rev 0x3a, msi
> pcib0 at pci0 dev 31 function 0 "Intel 100 Series LPC" rev 0x21
> "Intel 100 Series PMC" rev 0x21 at pci0 dev 31 function 2 not configured
> azalia0 at pci0 dev 31 function 3 "Intel 100 Series HD Audio" rev 0x21: msi
> azalia0: codecs: Realtek ALC269, Intel/0x2809, using Realtek ALC269
> audio0 at azalia0
> ichiic0 at pci0 dev 31 function 4 "Intel 100 Series SMBus" rev 0x21: apic 2 
> int 16
> iic0 at ichiic0
> isa0 at pcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard
> pms0 at pckbc0 (aux slot)
> wsmouse0 at pms0 mux 0
> pms0: Synaptics touchpad, firmware 8.2
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> vmm0 at mainbus0: VMX/EPT
> efifb0 at mainbus0: 1920x1080, 32bpp
> wsdisplay0 at efifb0 mux 1: console (std, vt100 emulation), using wskbd0
> wsdisplay0: screen 1-5 added (std, vt100 emulation)
> ugen0 at uhub0 port 5 "Intel Bluetooth" rev 2.00/0.01 addr 2
> uvideo0 at uhub0 port 6 configuration 1 interface 0 "Generic Chicony USB 2.0 
> Camera" rev 2.00/32.07 addr 3
> video0 at uvideo0
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> sd1 at scsibus3 targ 1 lun 0: <OPENBSD, SR CRYPTO, 006> SCSI2 0/direct fixed
> sd1: 228896MB, 512 bytes/sector, 468779647 sectors
> root on sd1a (927460f40d363126.a) swap on sd1b dump on sd1b
> iwm0: hw rev 0x200, fw ver 16.242414.0, address e4:a4:71:e1:45:79
> 

Reply via email to