out of swap on shutdown
On shutdown, on machines with a large tmpfs which has been thrashed, I see:

UVM: pid 1029.1 (master), uid 0 killed: out of swap
UVM: pid 701.1 (qmgr), uid 12 killed: out of swap

with what appears as a hang. In the past, I have left a box in this state overnight with no change. (I think about a year ago this would panic instead.) The difference is that today I just happen to have a serial console plugged in. Box running yesterday's -current/amd64.

Sending breaks shows that things aren't actually stuck:

fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 8024fa2d cs 8 rflags 202 cr2 7f7ff6c3c11a ilevel 8 rsp fe8043a05c50
curlwp 0xfe807fbb5020 pid 0.7 lowest kstack 0xfe8043a032c0
Stopped in pid 0.7 (system) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x524
Xintr_ioapic_edge9() at netbsd:Xintr_ioapic_edge9+0xea
--- interrupt ---
bus_space_read_4() at netbsd:bus_space_read_4+0xa
mi_switch() at netbsd:mi_switch+0x1e
sleepq_block() at netbsd:sleepq_block+0xa2
cv_wait() at netbsd:cv_wait+0x9a
xc_thread() at netbsd:xc_thread+0x52

Selection of processes:

PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
999 1 3 1 0 fe807d16bb20 swapctl plpg
0 51 3 0 200 fe8043ee09a0 raidio0 plpg
0 50 3 1 200 fe8043f215a0 raid0 rfnodeq
11 3 180 fe8043f21180 init wait
0 17 7 1 201 fe8043a478e0 idle/1
0 7 3 0 200 fe807fbb5020 xcall/0 xcall
01 3 1 200 80ccb980 swapper uvm

--- interrupt ---
uvmpdpol_pagedeactivate() at netbsd:uvmpdpol_pagedeactivate+0x22
uvmpdpol_balancequeue() at netbsd:uvmpdpol_balancequeue+0x87
uvm_pageout() at netbsd:uvm_pageout+0x3f8

--- interrupt ---
_atomic_cas_64() at netbsd:_atomic_cas_64+0x8
uvmpd_trylockowner() at netbsd:uvmpd_trylockowner+0x1e
uvm_pageout() at netbsd:uvm_pageout+0x1b9

--- interrupt ---
sleepq_remove() at netbsd:sleepq_remove+0x84
sleepq_wake() at netbsd:sleepq_wake+0x43
uvm_pageout() at netbsd:uvm_pageout+0x43b

--- interrupt ---
uvm_pageactivate() at netbsd:uvm_pageactivate

--- interrupt ---
mutex_tryenter() at netbsd:mutex_tryenter+0x80
uvmpd_trylockowner() at netbsd:uvmpd_trylockowner+0x1e
uvmpdpol_selectvictim() at netbsd:uvmpdpol_selectvictim+0x43
uvm_pageout() at netbsd:uvm_pageout+0x19d

It looks as though swapctl wants memory? Thoughts on how to make a more useful bug report?

Cheers,

Patrick
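One possible way to give such a report more context is to snapshot the swap and memory state just before shutting down. The following is only a rough sketch, not a recommended procedure: the helper name `run` and the output path are made up for illustration, and the choice of commands (swapctl -l, vmstat -s, df -k, mount) is an assumption about what might be useful here.

```shell
#!/bin/sh
# Hypothetical helper: record swap and memory state before shutdown.
# The output location is an assumption; pass another path as $1 if preferred.
out=${1:-${TMPDIR:-/tmp}/swap-state.txt}

run() {
    # Print the command line, then its output; note commands that are missing.
    echo "== $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(not available: $1)"
    fi
}

{
    run date
    run swapctl -l    # configured swap devices and how full they are
    run vmstat -s     # virtual memory counters
    run df -k         # file system usage, including any tmpfs
    run mount         # what is mounted where
} > "$out"

echo "wrote $out"
```

The resulting file could be attached to the report alongside the console output. Once in ddb after the break, `ps` and `show uvmexp` may add detail about which LWPs are waiting and how many free pages remain.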
Re: More amd64 drmkms radeon
Mine is also much better now - DRMKMS kernel boots into multiuser, switches the mode and works fine in multiuser. Xorg doesn't start; it blanks the screen and I presume panics, but I can't see anything; I will have to switch to serial console to see what is going on (I also had a panic from a KASSERT in igmp.c, which did panic on every shutdown/reboot/halt, but I commented it out and now it reboots cleanly).

The dmesg follows:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014 The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved.
NetBSD 7.99.1 (DRMKMS) #0: Fri Aug 15 13:50:02 BST 2014
root@support6.delcam.local:/root/a64/compile/DRMKMS
total memory = 3071 MB
avail memory = 2963 MB
kern.module.path=/stand/amd64/7.99.1/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter i8254 frequency 1193182 Hz quality 100
AMD Rhapsody (Rev 4)
mainbus0 (root)
ACPI: RSDP 0xf6fc0 24 (v02 PTLTD )
ACPI: XSDT 0xbff7b450 3C (v01 PTLTD ? XSDT 0604 LTP )
ACPI: FACP 0xbff7ee46 F4 (v03 AMDHAMMER 0604 PTEC 000F4240)
ACPI: DSDT 0xbff7b48c 003946 (v01 AMD-K8 AMDACPI 0604 MSFT 010E)
ACPI: FACS 0xbff7ffc0 40
ACPI: APIC 0xbff7ef3a 76 (v01 PTLTD ? APIC 0604 LTP )
ACPI: SPCR 0xbff7efb0 50 (v01 PTLTD $UCRTBL$ 0604 PTL 0001)
ACPI: All ACPI Tables successfully acquired
cpu0 at mainbus0 apid 0: AMD Opteron(tm) Processor 246, id 0xf5a
cpu0: erratum 86 present
cpu0: erratum 104 present
cpu0: erratum 101 present
cpu0: WARNING: errata present, BIOS upgrade may be
cpu0: WARNING: necessary to ensure reliable operation
cpu1 at mainbus0 apid 1: AMD Opteron(tm) Processor 246, id 0xf5a
ioapic0 at mainbus0 apid 2: pa 0xfec0, version 0x11, 24 pins
ioapic1 at mainbus0 apid 3: pa 0xf000, version 0x11, 4 pins
ioapic2 at mainbus0 apid 4: pa 0xf0001000, version 0x11, 4 pins
acpi0 at mainbus0: Intel ACPICA 20131218
acpi0: X/RSDT: OemId PTLTD , XSDT ,0604, AslId LTP,
acpi0: SCI interrupting at int 9
timecounter: Timecounter ACPI-Fast frequency 3579545 Hz quality 1000
acpibut0 at acpi0 (PWRB, PNP0C0C): ACPI Power Button
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
SYSR (PNP0C02) WARNING: module error: vfs load failed for `acpiverbose', error 45 at acpi0 not configured
pckbc1 at acpi0 (PS2M, PNP0F13) (aux port): irq 12
pckbc2 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
FDC0 (PNP0700) WARNING: module error: vfs load failed for `acpiverbose', error 45 at acpi0 not configured
UAR1 (PNP0501) WARNING: module error: vfs load failed for `acpiverbose', error 45 at acpi0 not configured
UAR2 (PNP0501) WARNING: module error: vfs load failed for `acpiverbose', error 45 at acpi0 not configured
LPT (PNP0400) WARNING: module error: vfs load failed for `acpiverbose', error 45 at acpi0 not configured
ACPI: Enabled 1 GPEs in block 00 to 0F
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S3_] (20131218/hwxface-646)
WARNING: module error: vfs load failed for `acpiverbose', error 45
attimer1: attached to pcppi1
pckbd0 at pckbc2 (kbd slot)
pckbc2: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: AMD AMD8151 AGP Device (rev. 0x13)
agp0 at pchb0: 2 Miscellaneous Control unit(s) found.
agp0: aperture at 0xe000, size 0x1000
ppb0 at pci0 dev 1 function 0: AMD AMD8151 AGP Bridge (rev. 0x13)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
radeon0 at pci1 dev 0 function 0: ATI Technologies FireGL T2 AT (rev. 0x80)
ppb1 at pci0 dev 6 function 0: AMD AMD8111 I/O Hub (rev. 0x07)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled
ohci0 at pci2 dev 0 function 0: AMD AMD8111 USB Host Controller (rev. 0x0b) csr: 02800017
ohci0: interrupting at ioapic0 pin 19
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
ohci1 at pci2 dev 0 function 1: AMD AMD8111 USB Host Controller (rev. 0x0b) csr: 02800017
ohci1: interrupting at ioapic0 pin 19
ohci1: OHCI version 1.0, legacy support
usb1 at ohci1: USB revision 1.0
ohci2 at pci2 dev 4 function 0: NEC USB Host Controller (rev. 0x41) csr: 02100016
ohci2: interrupting at ioapic0 pin 17
ohci2: OHCI version 1.0
usb2 at ohci2: USB revision 1.0
ohci3 at pci2 dev 4 function 1: NEC USB Host Controller (rev. 0x41) csr: 02100016
ohci3: interrupting at ioapic0 pin 18
ohci3: OHCI version 1.0
usb3 at ohci3: USB revision 1.0
ehci0 at pci2 dev 4 function 2: NEC USB2 Host Controller (rev. 0x02)
ehci0: interrupting at ioapic0 pin 19
ehci0: EHCI
Re: More amd64 drmkms radeon
On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> Mine is also much better now - DRMKMS kernel boots into multiuser,
> switches the mode and works fine in multiuser. Xorg doesn't start; it
> blanks the screen and I presume panics, but I can't see anything; I

I think X coredumps, but I have played hunt the core and haven't found it yet... (I can ssh in after screen goes blank)

Cheers,

Patrick
Re: More amd64 drmkms radeon
On Fri, Aug 15, 2014 at 02:55:37PM +0100, Patrick Welche wrote:
> On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> > Mine is also much better now - DRMKMS kernel boots into multiuser,
> > switches the mode and works fine in multiuser. Xorg doesn't start; it
> > blanks the screen and I presume panics, but I can't see anything; I
>
> I think X coredumps, but I have played hunt the core and haven't found
> it yet... (I can ssh in after screen goes blank)

That was before

Working file: external/mit/libdrm/dist/radeon/radeon_bo_gem.c
revision 1.4
date: 2014/08/14 20:56:10; author: mrg; state: Exp; lines: +2 -2
convert an mmap() to drmMap().

P
Re: More amd64 drmkms radeon
On Fri, Aug 15, 2014 at 02:55:37PM +0100, Patrick Welche wrote:
> On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> > Mine is also much better now - DRMKMS kernel boots into multiuser,
> > switches the mode and works fine in multiuser. Xorg doesn't start; it
> > blanks the screen and I presume panics, but I can't see anything; I
>
> I think X coredumps, but I have played hunt the core and haven't found
> it yet... (I can ssh in after screen goes blank)

Got it - but no symbols:

Program terminated with signal SIGABRT, Aborted.
#0 0x7f7ff430d32a in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0 0x7f7ff430d32a in _lwp_kill () from /usr/lib/libc.so.12
#1 0x7f7ff430cfb5 in abort () from /usr/lib/libc.so.12
#2 0x0054ad60 in OsAbort ()
#3 0x00464707 in ddxGiveUp ()
#4 0x0054819d in AbortServer ()
#5 0x0054853b in FatalError ()
#6 0x0054b578 in ?? ()
#7 signal handler called
#8 0x7f7ff430ce74 in memcpy () from /usr/lib/libc.so.12
#9 0x7f7fee60ecee in ?? () from /usr/X11R7/lib/modules/libexa.so
#10 0x7f7fee60f114 in ?? () from /usr/X11R7/lib/modules/libexa.so
#11 0x7f7fee60ddad in exaPrepareAccessReg_mixed () from /usr/X11R7/lib/modules/libexa.so
#12 0x7f7fee6057f7 in ExaCheckPolyGlyphBlt () from /usr/X11R7/lib/modules/libexa.so
#13 0x004b85f8 in miPolyText8 ()
#14 0x0050622b in ?? ()
#15 0x0044cc5a in doPolyText ()
#16 0x0044dada in PolyText ()
#17 0x00451007 in ProcPolyText ()
#18 0x004534b1 in Dispatch ()
#19 0x0059d0da in main ()

(yesterday's -current/amd64
Device Name: Radeon X600 PCI Express (0x5b62) aka RV380)

Cheers,

Patrick
Re: More amd64 drmkms radeon
Patrick Welche wrote:
> On Fri, Aug 15, 2014 at 02:55:37PM +0100, Patrick Welche wrote:
> > On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> > > Mine is also much better now - DRMKMS kernel boots into multiuser,
> > > switches the mode and works fine in multiuser. Xorg doesn't start; it
> > > blanks the screen and I presume panics, but I can't see anything; I
> >
> > I think X coredumps, but I have played hunt the core and haven't found
> > it yet... (I can ssh in after screen goes blank)
>
> That was before
>
> Working file: external/mit/libdrm/dist/radeon/radeon_bo_gem.c
> revision 1.4
> date: 2014/08/14 20:56:10; author: mrg; state: Exp; lines: +2 -2
> convert an mmap() to drmMap().

What does it do now?

I get a panic in a call to munmap(2) but that may just be happening when the server is cleaning up from some other error.

Robert Swindells
Re: amd64 DRMKMS (ivy bridge) results, 14 Jun 2014
On Thu, Aug 14, 2014 at 07:09:46PM +0200, Piotr Meyer wrote:
> Ivy Bridge integrated GPU, fresh sources:
>
> 1. Now, I got valid resolution on console and in X (1280x1024).
>    Much better. ;)
>
> 2. glxgears leads to panic, after ddb.onpanic=1 I got:

Some additional debug info, hope this helps a little:

#v+
Reading symbols from /netbsd.gdb...done.
Load new symbol table from /netbsd.gdb? (y or n) y
Reading symbols from /netbsd.gdb...done.
0x8067a5bf in cpu_reboot (howto=howto@entry=256, bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:671
671 dumpsys();
#0 0x8067a5bf in cpu_reboot (howto=howto@entry=256, bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:671
#1 0x8026 in db_sync_cmd (addr=<optimized out>, have_addr=<optimized out>, count=<optimized out>, modif=<optimized out>) at /usr/src/sys/ddb/db_command.c:1358
#2 0x802ab267 in db_command (last_cmdp=last_cmdp@entry=0x8109e180 <db_last_command>) at /usr/src/sys/ddb/db_command.c:907
#3 0x802ab5f4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:565
#4 0x802b0d0d in db_trap (type=type@entry=1, code=code@entry=0) at /usr/src/sys/ddb/db_trap.c:90
#5 0x802adf80 in kdb_trap (type=type@entry=1, code=code@entry=0, regs=regs@entry=0xfe80cefffc30) at /usr/src/sys/arch/amd64/amd64/db_interface.c:227
#6 0x809157c2 in trap (frame=0xfe80cefffc30) at /usr/src/sys/arch/amd64/amd64/trap.c:287
#7 0x80100fde in alltraps ()
#8 0x8028bc5d in breakpoint ()
#9 0x808c6f90 in vpanic (fmt=0x80d34318 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=ap@entry=0xfe80cefffd60) at /usr/src/sys/kern/subr_prf.c:338
#10 0x80a62393 in kern_assert (fmt=fmt@entry=0x80d34318 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ") at /usr/src/sys/lib/libkern/kern_assert.c:51
#11 0x803b27ec in DRM_SPIN_WAITERS_P (interlock=0x86b03ca0, q=0x86b03688) at /usr/src/sys/external/bsd/drm2/include/drm/drm_wait_netbsd.h:89
#12 i915_record_ring_state (ering=0xfe811cbf7a10, ring=0x86b03560, dev=0xfe8106bc3010) at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_gpu_error.c:812
#13 i915_gem_record_rings (error=0xfe811cbf7810, dev=0xfe8106bc3010) at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_gpu_error.c:924
#14 i915_capture_error_state (dev=0xfe8106bc3010, wedged=<optimized out>, error_msg=0xfe80cefffe88 "Ring hung") at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_gpu_error.c:1191
#15 0x803b763f in i915_handle_error (dev=0xfe8106bc3010, wedged=<optimized out>, fmt=<optimized out>) at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_irq.c:2364
#16 0x80618752 in callout_softclock (v=<optimized out>) at /usr/src/sys/kern/kern_timeout.c:736
#17 0x8060d338 in softint_execute (l=<optimized out>, s=2, si=0x800042a310c0) at /usr/src/sys/kern/kern_softint.c:589
#18 softint_dispatch (pinned=<error reading variable: Cannot access memory at address 0xfe80cfc0>, s=2) at /usr/src/sys/kern/kern_softint.c:871
#19 0x8011412f in Xsoftintr ()
/usr/src/external/gpl3/gdb/dist/gdb/frame.c:472: internal-error: get_frame_id: Assertion `fi->this_id.p' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y
/usr/src/external/gpl3/gdb/dist/gdb/frame.c:472: internal-error: get_frame_id: Assertion `fi->this_id.p' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
#v-

Kernel running w/o DIAGNOSTIC/DEBUG gave me:

#v+
drm: stuck on render ring
drm: GPU HANG: ecode 0:0x86ed, reason: Ring hung, action: reset
drm: GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
drm: Please file a _new_ bug report on bugs.freedesktop.org against DRI - DRM/Intel
drm: drm/i915 developers can then reassign to the right component if it's not a kernel issue.
drm: The gpu crash dump is required to analyze gpu hangs, so please always attach it.
drm: GPU crash dump saved to /sys/class/drm/card0/error
i915drmkms0: interrupting at ioapic0 pin 16 (pci0@pci::00:02.0)
drm: Enabling RC6 states: RC6 on, RC6p on, RC6pp off
drm: stuck on render ring
drm: GPU HANG: ecode 0:0x86ed7c1f, reason: Ring hung, action: reset
DRM error in i915_context_is_banned: gpu hanging too fast, banning!
i915drmkms0: interrupting at ioapic0 pin 16 (pci0@pci::00:02.0)
drm: Enabling RC6 states: RC6 on, RC6p on, RC6pp off
#v-

Regards,
-- 
Piotr 'aniou' Meyer
Re: Building PCC for tools is broken (missing symbol __USE)- PCC bug or NetBSD source tree error?
On Wed, 13 Aug 2014, William D. Jones wrote:
> The error to which I refer to (cannot find -lgcc) also occurs now, even
> when I set HAVE_PCC=1 while building libc... it seems that there is a
> dependency problem that has crept into the NetBSD source tree the past
> few days, because me receiving complaints about a missing libgcc when
> only building the PCC tools only started in the 2 to 4 days.

I am not sure about this recent addition, you might have a contaminated objdir, do you start from empty dir?

I am thinking about this setup you are trying for..

Firstly, if GCC is being used to build something, then I think that it will always add -lgcc to the linker command. This is because it uses that to provide support for specific features (the code it produces calls routines in there to do things, such as floating point math)

Then, in lib/Makefile, we don't build libgcc if MKGCC==no but perhaps we should (if ACTIVE_CC==gcc then it will be needed during the build if not the runtime)

if you force libgcc to be built (comment out the MKGCC conditional in lib/Makefile) will it continue?

also, you should probably use HAVE_GCC=48 since that is the version in use, and the version number is checked sometimes for various features

iain
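Two of the suggestions above (starting from an empty objdir and setting HAVE_GCC=48) might be tried along the following lines. This is a sketch under assumptions, not the setup used in the thread: the source location, the object directory name, and the choice of the `tools` target are placeholders, and any PCC-related variables from the original build would still need to be added.

```shell
#!/bin/sh
# Sketch only: both paths below are assumptions, adjust to the local tree.
src=${NETBSDSRCDIR:-/usr/src}
obj=${OBJ:-/usr/obj.clean}    # hypothetical empty object directory

if [ -x "$src/build.sh" ]; then
    # -U: unprivileged build; -O: object root, so stale objects elsewhere
    # are not picked up; -V: set a make variable for this run.
    cd "$src" && ./build.sh -U -O "$obj" -V HAVE_GCC=48 tools
else
    echo "no build.sh under $src; set NETBSDSRCDIR" >&2
fi
```

Forcing libgcc to build is a source edit (commenting out the MKGCC conditional in lib/Makefile, as described above) rather than a command-line option, so it is not shown here.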