out of swap on shutdown

2014-08-15 Thread Patrick Welche
On shutdown, on machines with a large tmpfs which has been thrashed, I
see:

UVM: pid 1029.1 (master), uid 0 killed: out of swap
UVM: pid 701.1 (qmgr), uid 12 killed: out of swap

with what appears to be a hang. In the past, I have left a box in this
state overnight with no change.

(I think about a year ago this would panic instead.) The difference is that
today I just happen to have a serial console plugged in. Box running
yesterday's -current/amd64. 

Sending breaks shows that things aren't actually stuck:

fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 8024fa2d cs 8 rflags 202 cr2 7f7ff6c3c11a ilevel 8 rsp fe8043a05c50
curlwp 0xfe807fbb5020 pid 0.7 lowest kstack 0xfe8043a032c0
Stopped in pid 0.7 (system) at  netbsd:breakpoint+0x5:  leave
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x524
Xintr_ioapic_edge9() at netbsd:Xintr_ioapic_edge9+0xea
--- interrupt ---
bus_space_read_4() at netbsd:bus_space_read_4+0xa
mi_switch() at netbsd:mi_switch+0x1e
sleepq_block() at netbsd:sleepq_block+0xa2
cv_wait() at netbsd:cv_wait+0x9a
xc_thread() at netbsd:xc_thread+0x52

Selection of processes:
PID  LID S CPU FLAGS   STRUCT LWP *   NAME     WAIT
999    1 3   1     0   fe807d16bb20   swapctl  plpg
0     51 3   0   200   fe8043ee09a0   raidio0  plpg
0     50 3   1   200   fe8043f215a0   raid0    rfnodeq
1      1 3   1    80   fe8043f21180   init     wait
0     17 7   1   201   fe8043a478e0   idle/1
0      7 3   0   200   fe807fbb5020   xcall/0  xcall
0      1 3   1   200   80ccb980       swapper  uvm

--- interrupt ---
uvmpdpol_pagedeactivate() at netbsd:uvmpdpol_pagedeactivate+0x22
uvmpdpol_balancequeue() at netbsd:uvmpdpol_balancequeue+0x87
uvm_pageout() at netbsd:uvm_pageout+0x3f8

--- interrupt ---
_atomic_cas_64() at netbsd:_atomic_cas_64+0x8
uvmpd_trylockowner() at netbsd:uvmpd_trylockowner+0x1e
uvm_pageout() at netbsd:uvm_pageout+0x1b9

--- interrupt ---
sleepq_remove() at netbsd:sleepq_remove+0x84
sleepq_wake() at netbsd:sleepq_wake+0x43
uvm_pageout() at netbsd:uvm_pageout+0x43b

--- interrupt ---
uvm_pageactivate() at netbsd:uvm_pageactivate

--- interrupt ---
mutex_tryenter() at netbsd:mutex_tryenter+0x80
uvmpd_trylockowner() at netbsd:uvmpd_trylockowner+0x1e
uvmpdpol_selectvictim() at netbsd:uvmpdpol_selectvictim+0x43
uvm_pageout() at netbsd:uvm_pageout+0x19d


It looks as though swapctl wants memory?
Thoughts on how to make a more useful bug report?

Cheers,

Patrick


Re: More amd64 drmkms radeon

2014-08-15 Thread Chavdar Ivanov
Mine is also much better now - DRMKMS kernel boots into multiuser,
switches the mode and works fine in multiuser. Xorg doesn't start; it
blanks the screen and I presume panics, but I can't see anything; I
will have to switch to serial console to see what is going on (I also
had a panic from a KASSERT in igmp.c, which did panic on every
shutdown/reboot/halt, but I commented it out and now it reboots
cleanly).

The dmesg follows:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 7.99.1 (DRMKMS) #0: Fri Aug 15 13:50:02 BST 2014
root@support6.delcam.local:/root/a64/compile/DRMKMS
total memory = 3071 MB
avail memory = 2963 MB
kern.module.path=/stand/amd64/7.99.1/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter i8254 frequency 1193182 Hz quality 100
AMD Rhapsody (Rev 4)
mainbus0 (root)
ACPI: RSDP 0xf6fc0 24 (v02 PTLTD )
ACPI: XSDT 0xbff7b450 3C (v01 PTLTD  ? XSDT   0604  LTP )
ACPI: FACP 0xbff7ee46 F4 (v03 AMDHAMMER   0604 PTEC 000F4240)
ACPI: DSDT 0xbff7b48c 003946 (v01 AMD-K8  AMDACPI 0604 MSFT 010E)
ACPI: FACS 0xbff7ffc0 40
ACPI: APIC 0xbff7ef3a 76 (v01 PTLTD  ? APIC   0604  LTP )
ACPI: SPCR 0xbff7efb0 50 (v01 PTLTD  $UCRTBL$ 0604 PTL  0001)
ACPI: All ACPI Tables successfully acquired
cpu0 at mainbus0 apid 0: AMD Opteron(tm) Processor 246, id 0xf5a
cpu0: erratum 86 present
cpu0: erratum 104 present
cpu0: erratum 101 present
cpu0: WARNING: errata present, BIOS upgrade may be
cpu0: WARNING: necessary to ensure reliable operation
cpu1 at mainbus0 apid 1: AMD Opteron(tm) Processor 246, id 0xf5a
ioapic0 at mainbus0 apid 2: pa 0xfec0, version 0x11, 24 pins
ioapic1 at mainbus0 apid 3: pa 0xf000, version 0x11, 4 pins
ioapic2 at mainbus0 apid 4: pa 0xf0001000, version 0x11, 4 pins
acpi0 at mainbus0: Intel ACPICA 20131218
acpi0: X/RSDT: OemId PTLTD , XSDT  ,0604, AslId  LTP,
acpi0: SCI interrupting at int 9
timecounter: Timecounter ACPI-Fast frequency 3579545 Hz quality 1000
acpibut0 at acpi0 (PWRB, PNP0C0C): ACPI Power Button
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
SYSR (PNP0C02) WARNING: module error: vfs load failed for `acpiverbose', error 45
at acpi0 not configured
pckbc1 at acpi0 (PS2M, PNP0F13) (aux port): irq 12
pckbc2 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
FDC0 (PNP0700) WARNING: module error: vfs load failed for `acpiverbose', error 45
at acpi0 not configured
UAR1 (PNP0501) WARNING: module error: vfs load failed for `acpiverbose', error 45
at acpi0 not configured
UAR2 (PNP0501) WARNING: module error: vfs load failed for `acpiverbose', error 45
at acpi0 not configured
LPT (PNP0400) WARNING: module error: vfs load failed for `acpiverbose', error 45
at acpi0 not configured
ACPI: Enabled 1 GPEs in block 00 to 0F
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S3_] (20131218/hwxface-646)
WARNING: module error: vfs load failed for `acpiverbose', error 45
attimer1: attached to pcppi1
pckbd0 at pckbc2 (kbd slot)
pckbc2: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: AMD AMD8151 AGP Device (rev. 0x13)
agp0 at pchb0: 2 Miscellaneous Control unit(s) found.
agp0: aperture at 0xe000, size 0x1000
ppb0 at pci0 dev 1 function 0: AMD AMD8151 AGP Bridge (rev. 0x13)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
radeon0 at pci1 dev 0 function 0: ATI Technologies FireGL T2 AT (rev. 0x80)
ppb1 at pci0 dev 6 function 0: AMD AMD8111 I/O Hub (rev. 0x07)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled
ohci0 at pci2 dev 0 function 0: AMD AMD8111 USB Host Controller (rev. 0x0b)
csr: 02800017
ohci0: interrupting at ioapic0 pin 19
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
ohci1 at pci2 dev 0 function 1: AMD AMD8111 USB Host Controller (rev. 0x0b)
csr: 02800017
ohci1: interrupting at ioapic0 pin 19
ohci1: OHCI version 1.0, legacy support
usb1 at ohci1: USB revision 1.0
ohci2 at pci2 dev 4 function 0: NEC USB Host Controller (rev. 0x41)
csr: 02100016
ohci2: interrupting at ioapic0 pin 17
ohci2: OHCI version 1.0
usb2 at ohci2: USB revision 1.0
ohci3 at pci2 dev 4 function 1: NEC USB Host Controller (rev. 0x41)
csr: 02100016
ohci3: interrupting at ioapic0 pin 18
ohci3: OHCI version 1.0
usb3 at ohci3: USB revision 1.0
ehci0 at pci2 dev 4 function 2: NEC USB2 Host Controller (rev. 0x02)
ehci0: interrupting at ioapic0 pin 19
ehci0: EHCI 

Re: More amd64 drmkms radeon

2014-08-15 Thread Patrick Welche
On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> Mine is also much better now - DRMKMS kernel boots into multiuser,
> switches the mode and works fine in multiuser. Xorg doesn't start; it
> blanks the screen and I presume panics, but I can't see anything; I

I think X coredumps, but I have played hunt the core and haven't found
it yet... (I can ssh in after screen goes blank)

Cheers,

Patrick


Re: More amd64 drmkms radeon

2014-08-15 Thread Patrick Welche
On Fri, Aug 15, 2014 at 02:55:37PM +0100, Patrick Welche wrote:
> On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> > Mine is also much better now - DRMKMS kernel boots into multiuser,
> > switches the mode and works fine in multiuser. Xorg doesn't start; it
> > blanks the screen and I presume panics, but I can't see anything; I
>
> I think X coredumps, but I have played hunt the core and haven't found
> it yet... (I can ssh in after screen goes blank)

That was before

Working file: external/mit/libdrm/dist/radeon/radeon_bo_gem.c
revision 1.4
date: 2014/08/14 20:56:10;  author: mrg;  state: Exp;  lines: +2 -2
convert an mmap() to drmMap().


P


Re: More amd64 drmkms radeon

2014-08-15 Thread Patrick Welche
On Fri, Aug 15, 2014 at 02:55:37PM +0100, Patrick Welche wrote:
> On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> > Mine is also much better now - DRMKMS kernel boots into multiuser,
> > switches the mode and works fine in multiuser. Xorg doesn't start; it
> > blanks the screen and I presume panics, but I can't see anything; I
>
> I think X coredumps, but I have played hunt the core and haven't found
> it yet... (I can ssh in after screen goes blank)

Got it - but no symbols:

Program terminated with signal SIGABRT, Aborted.
#0  0x7f7ff430d32a in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x7f7ff430d32a in _lwp_kill () from /usr/lib/libc.so.12
#1  0x7f7ff430cfb5 in abort () from /usr/lib/libc.so.12
#2  0x0054ad60 in OsAbort ()
#3  0x00464707 in ddxGiveUp ()
#4  0x0054819d in AbortServer ()
#5  0x0054853b in FatalError ()
#6  0x0054b578 in ?? ()
#7  signal handler called
#8  0x7f7ff430ce74 in memcpy () from /usr/lib/libc.so.12
#9  0x7f7fee60ecee in ?? () from /usr/X11R7/lib/modules/libexa.so
#10 0x7f7fee60f114 in ?? () from /usr/X11R7/lib/modules/libexa.so
#11 0x7f7fee60ddad in exaPrepareAccessReg_mixed ()
   from /usr/X11R7/lib/modules/libexa.so
#12 0x7f7fee6057f7 in ExaCheckPolyGlyphBlt ()
   from /usr/X11R7/lib/modules/libexa.so
#13 0x004b85f8 in miPolyText8 ()
#14 0x0050622b in ?? ()
#15 0x0044cc5a in doPolyText ()
#16 0x0044dada in PolyText ()
#17 0x00451007 in ProcPolyText ()
#18 0x004534b1 in Dispatch ()
#19 0x0059d0da in main ()

(yesterday's -current/amd64
Device Name: Radeon X600 PCI Express (0x5b62)
aka RV380)

Cheers,

Patrick


Re: More amd64 drmkms radeon

2014-08-15 Thread Robert Swindells

Patrick Welche wrote:
> On Fri, Aug 15, 2014 at 02:55:37PM +0100, Patrick Welche wrote:
> > On Fri, Aug 15, 2014 at 02:48:47PM +0100, Chavdar Ivanov wrote:
> > > Mine is also much better now - DRMKMS kernel boots into multiuser,
> > > switches the mode and works fine in multiuser. Xorg doesn't start; it
> > > blanks the screen and I presume panics, but I can't see anything; I
> >
> > I think X coredumps, but I have played hunt the core and haven't found
> > it yet... (I can ssh in after screen goes blank)
>
> That was before
>
> Working file: external/mit/libdrm/dist/radeon/radeon_bo_gem.c
> revision 1.4
> date: 2014/08/14 20:56:10;  author: mrg;  state: Exp;  lines: +2 -2
> convert an mmap() to drmMap().

What does it do now ?

I get a panic in a call to munmap(2) but that may just be happening when
the server is cleaning up from some other error.

Robert Swindells


Re: amd64 DRMKMS (ivy bridge) results, 14 Jun 2014

2014-08-15 Thread Piotr Meyer
On Thu, Aug 14, 2014 at 07:09:46PM +0200, Piotr Meyer wrote:
> Ivy Bridge integrated GPU, fresh sources:
>
> 1. Now, I got valid resolution on console and in X (1280x1024). Much better. ;)
>
> 2. glxgears leads to panic, after ddb.onpanic=1 I got:
>
 
Some additional debug info, hope this helps a little:


#v+
Reading symbols from /netbsd.gdb...done.
Load new symbol table from /netbsd.gdb? (y or n) y
Reading symbols from /netbsd.gdb...done.
0x8067a5bf in cpu_reboot (howto=howto@entry=256, bootstr=bootstr@entry=0x0)
    at /usr/src/sys/arch/amd64/amd64/machdep.c:671
671 dumpsys();
#0  0x8067a5bf in cpu_reboot (howto=howto@entry=256, bootstr=bootstr@entry=0x0)
    at /usr/src/sys/arch/amd64/amd64/machdep.c:671
#1  0x8026 in db_sync_cmd (addr=<optimized out>, have_addr=<optimized out>, count=<optimized out>, modif=<optimized out>)
    at /usr/src/sys/ddb/db_command.c:1358
#2  0x802ab267 in db_command (last_cmdp=last_cmdp@entry=0x8109e180 <db_last_command>)
    at /usr/src/sys/ddb/db_command.c:907
#3  0x802ab5f4 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:565
#4  0x802b0d0d in db_trap (type=type@entry=1, code=code@entry=0)
    at /usr/src/sys/ddb/db_trap.c:90
#5  0x802adf80 in kdb_trap (type=type@entry=1, code=code@entry=0, regs=regs@entry=0xfe80cefffc30)
    at /usr/src/sys/arch/amd64/amd64/db_interface.c:227
#6  0x809157c2 in trap (frame=0xfe80cefffc30)
    at /usr/src/sys/arch/amd64/amd64/trap.c:287
#7  0x80100fde in alltraps ()
#8  0x8028bc5d in breakpoint ()
#9  0x808c6f90 in vpanic (fmt=0x80d34318 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=ap@entry=0xfe80cefffd60)
    at /usr/src/sys/kern/subr_prf.c:338
#10 0x80a62393 in kern_assert (fmt=fmt@entry=0x80d34318 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ")
    at /usr/src/sys/lib/libkern/kern_assert.c:51
#11 0x803b27ec in DRM_SPIN_WAITERS_P (interlock=0x86b03ca0, q=0x86b03688)
    at /usr/src/sys/external/bsd/drm2/include/drm/drm_wait_netbsd.h:89
#12 i915_record_ring_state (ering=0xfe811cbf7a10, ring=0x86b03560, dev=0xfe8106bc3010)
    at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_gpu_error.c:812
#13 i915_gem_record_rings (error=0xfe811cbf7810, dev=0xfe8106bc3010)
    at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_gpu_error.c:924
#14 i915_capture_error_state (dev=0xfe8106bc3010, wedged=<optimized out>, error_msg=0xfe80cefffe88 "Ring hung")
    at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_gpu_error.c:1191
#15 0x803b763f in i915_handle_error (dev=0xfe8106bc3010, wedged=<optimized out>, fmt=<optimized out>)
    at /usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_irq.c:2364
#16 0x80618752 in callout_softclock (v=<optimized out>)
    at /usr/src/sys/kern/kern_timeout.c:736
#17 0x8060d338 in softint_execute (l=<optimized out>, s=2, si=0x800042a310c0)
    at /usr/src/sys/kern/kern_softint.c:589
#18 softint_dispatch (pinned=<error reading variable: Cannot access memory at address 0xfe80cfc0>, s=2)
    at /usr/src/sys/kern/kern_softint.c:871
#19 0x8011412f in Xsoftintr ()
/usr/src/external/gpl3/gdb/dist/gdb/frame.c:472: internal-error: get_frame_id: Assertion `fi->this_id.p' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y
/usr/src/external/gpl3/gdb/dist/gdb/frame.c:472: internal-error: get_frame_id: Assertion `fi->this_id.p' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
#v-


Kernel running w/o DIAGNOSTIC/DEBUG gave me:


#v+
drm: stuck on render ring
drm: GPU HANG: ecode 0:0x86ed, reason: Ring hung, action: reset
drm: GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
drm: Please file a _new_ bug report on bugs.freedesktop.org against DRI - DRM/Intel
drm: drm/i915 developers can then reassign to the right component if it's not a kernel issue.
drm: The gpu crash dump is required to analyze gpu hangs, so please always attach it.
drm: GPU crash dump saved to /sys/class/drm/card0/error
i915drmkms0: interrupting at ioapic0 pin 16 (pci0@pci::00:02.0)
drm: Enabling RC6 states: RC6 on, RC6p on, RC6pp off
drm: stuck on render ring
drm: GPU HANG: ecode 0:0x86ed7c1f, reason: Ring hung, action: reset
DRM error in i915_context_is_banned: gpu hanging too fast, banning!
i915drmkms0: interrupting at ioapic0 pin 16 (pci0@pci::00:02.0)
drm: Enabling RC6 states: RC6 on, RC6p on, RC6pp off
#v-

Regards,
-- 
Piotr 'aniou' Meyer


Re: Building PCC for tools is broken (missing symbol __USE)- PCC bug or NetBSD source tree error?

2014-08-15 Thread Iain Hibbert
On Wed, 13 Aug 2014, William D. Jones wrote:

> The error I am referring to (cannot find -lgcc) also occurs now, even when I
> set HAVE_PCC=1 while building libc... it seems that a dependency
> problem has crept into the NetBSD source tree in the past few days, because
> the complaints about a missing libgcc when building only the PCC
> tools only started in the last 2 to 4 days.

I am not sure about this recent change. You might have a contaminated
objdir; do you start from an empty directory?

I have been thinking about the setup you are trying for.

Firstly, if GCC is being used to build something, then I think that it
will always add -lgcc to the linker command. This is because it
uses libgcc to provide support for specific features (the code it produces
calls routines in there to do things such as floating point math).

Then, in lib/Makefile, we don't build libgcc if MKGCC==no, but perhaps we
should (if ACTIVE_CC==gcc then it will be needed during the build, if not
at runtime).

If you force libgcc to be built (comment out the MKGCC conditional in
lib/Makefile), will it continue?

Also, you should probably use HAVE_GCC=48, since that is the version in
use, and the version number is sometimes checked for various features.
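
To make the suggested condition concrete, here is a minimal sketch of the decision in Python rather than make syntax, purely for illustration. The function name `should_build_libgcc` is hypothetical and does not exist in the source tree; only the variable names MKGCC and ACTIVE_CC come from the discussion above, and the real conditional in lib/Makefile may be structured differently.

```python
def should_build_libgcc(mkgcc: str, active_cc: str) -> bool:
    """Hypothetical model of the proposed lib/Makefile condition.

    As described above, libgcc is currently skipped when MKGCC is "no".
    The suggestion is to build it anyway when gcc is the active compiler,
    since gcc adds -lgcc to the link command during the build.
    """
    return mkgcc != "no" or active_cc == "gcc"


# MKGCC=no with gcc still compiling the tree: libgcc would be built.
print(should_build_libgcc("no", "gcc"))
# MKGCC=no with pcc as the active compiler: libgcc would be skipped.
print(should_build_libgcc("no", "pcc"))
```

Under this sketch, the case reported here (MKGCC=no while gcc is still doing the compiling) is the one that changes from "skip" to "build".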

iain