uvm_fault, possibly fstat(1) related

2009-12-27 Thread John Cosimano
i was troubleshooting some work i'm doing with python, and ran fstat(1),
as root, with no arguments.

machine is rack-mounted in a data center, far away from me, but with an
assist, i was able to get the following screen cap:

http://imgur.com/kdbaW.jpg

here's the text included therein:

uvm_fault(0xd8227010, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at fill_file2+0x346: movl 0x44(%edx),%eax
ddb{0}>

unfortunately, due to some problems on my end, i wasn't able to get a
remote console. after the machine came back up, i ran fstat as an
unprivileged user, again with no arguments and the host crashed again.
no screen cap/scrape from the 2nd lock, however. as soon as i get my
KVM/java nonsense sorted, i'll be able to test further.

but for now, i'll not be running fstat(1) again, but i have a gut feeling
this is symptomatic of something else, that probably needs attention
sooner rather than later.

i saw in the archive that Stuart Henderson posted something similar a
while back, but didn't see any follow-up. any hints to get me looking in
the right direction?

i'm running 4.6-STABLE, SMP kernel. dmesg attached:

OpenBSD 4.6 (build) #5: Thu Nov 12 10:28:47 EST 2009
j...@cosmicnetworks.net:/usr/local/build
cpu0: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz (GenuineIntel 686-class) 2.41 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR
real mem  = 2145677312 (2046MB)
avail mem = 2065956864 (1970MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 09/17/07, BIOS32 rev. 0 @ 0xf0010, SMBIOS rev. 2.5 @ 0xfd620 (27 entries)
bios0: vendor American Megatrends Inc. version 080012 date 09/17/2007
bios0: PCCHIPS P17G
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP APIC MCFG OEMB HPET
acpi0: wakeup devices P0P2(S4) P0P1(S4) WOL_(S4) PS2K(S3) PS2M(S3) UAR1(S4) EUSB(S4) MC97(S4) USB0(S3) USB1(S3) USB2(S3) USB3(S3) P0P4(S4) P0P5(S4) P0P6(S4) P0P7(S4) P0P8(S4) P0P9(S4) SLPB(S4) PWRB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 266MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz (GenuineIntel 686-class) 2.40 GHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (P0P1)
acpiprt2 at acpi0: bus -1 (P0P4)
acpiprt3 at acpi0: bus -1 (P0P5)
acpiprt4 at acpi0: bus -1 (P0P6)
acpiprt5 at acpi0: bus -1 (P0P7)
acpiprt6 at acpi0: bus -1 (P0P8)
acpiprt7 at acpi0: bus -1 (P0P9)
acpicpu0 at acpi0: PSS
acpicpu1 at acpi0: PSS
acpitz0 at acpi0: critical temperature 60 degC
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
bios0: ROM list: 0xc/0xae00!
cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2394, 1596 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82945G Host rev 0x02
vga1 at pci0 dev 2 function 0 Intel 82945G Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0 at vga1: apic 2 int 16 (irq 10)
drm0 at inteldrm0
uhci0 at pci0 dev 29 function 0 Intel 82801GB USB rev 0x01: apic 2 int 23 (irq 5)
uhci1 at pci0 dev 29 function 1 Intel 82801GB USB rev 0x01: apic 2 int 19 (irq 11)
uhci2 at pci0 dev 29 function 2 Intel 82801GB USB rev 0x01: apic 2 int 18 (irq 15)
uhci3 at pci0 dev 29 function 3 Intel 82801GB USB rev 0x01: apic 2 int 16 (irq 10)
ehci0 at pci0 dev 29 function 7 Intel 82801GB USB rev 0x01: apic 2 int 23 (irq 5)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb0 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0xe1
pci1 at ppb0 bus 1
rl0 at pci1 dev 5 function 0 Realtek 8139 rev 0x10: apic 2 int 20 (irq 3), address 00:19:21:4d:b2:07
rlphy0 at rl0 phy 0: RTL internal PHY
ichpcib0 at pci0 dev 31 function 0 Intel 82801GB LPC rev 0x01: PM disabled
pciide0 at pci0 dev 31 function 1 Intel 82801GB IDE rev 0x01: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
pciide0: channel 1 ignored (disabled)
pciide1 at pci0 dev 31 function 2 Intel 82801GB SATA rev 0x01: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide1: using apic 2 int 19 (irq 11) for native-PCI interrupt
wd0 at pciide1 channel 0 drive 0: WDC WD2500AAKS-22B3A0
wd0: 16-sector PIO, LBA48, 238475MB, 488397168 sectors
wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 5
ichiic0 at pci0 dev 31 function 3 Intel 82801GB SMBus rev 0x01: apic 2 int 19 (irq 10)
iic0 at ichiic0

Re: uvm_fault, possibly fstat(1) related

2009-12-27 Thread Philip Guenther
On Sun, Dec 27, 2009 at 10:25 AM, John Cosimano <j...@cosmicnetworks.net> wrote:
> i was troubleshooting some work i'm doing with python, and ran fstat(1),
> as root, with no arguments.
...
> uvm_fault(0xd8227010, 0x0, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at fill_file2+0x346: movl 0x44(%edx),%eax
> ddb{0}>

That would seem to say that fstat's sysctl() call caught a process
with an IPv4 or IPv6 socket with a non-NULL so_pcb that pointed
somewhere bogus.  That suggests that more information about what's
uncommon about your setup is in order.  Weird socket tricks?  NAT
combined with redirections in your pf.conf?  Transparent proxies?
Divert sockets?

...
> i saw in the archive that Stuart Henderson posted something similar a
> while back, but didn't see any follow-up. any hints to get me looking in
> the right direction?

You have a reference to his post handy?


> i'm running 4.6-STABLE, SMP kernel. dmesg attached:
>
> OpenBSD 4.6 (build) #5: Thu Nov 12 10:28:47 EST 2009
> j...@cosmicnetworks.net:/usr/local/build

Ahem.  You're running a non-standard kernel and yet don't call that
out or describe what the difference is between it and the standard
GENERIC.MP config?  Ah well, I guess I've typed this far, so I'll
send this response instead of deleting it...


Philip Guenther



Re: uvm_fault, possibly fstat(1) related

2009-12-27 Thread John Cosimano
--- Philip Guenther [Sun, Dec 27, 2009 at 03:46:33PM -0800]: ---
> On Sun, Dec 27, 2009 at 10:25 AM, John Cosimano <j...@cosmicnetworks.net> wrote:
> ...
> > i saw in the archive that Stuart Henderson posted something similar a
> > while back, but didn't see any follow-up. any hints to get me looking in
> > the right direction?
>
> You have a reference to his post handy?

it was on tech@, now that i recall, which probably makes it irrelevant
to have mentioned here in the first place. if it's not too OT to
mention here:

http://archives.neohapsis.com/archives/openbsd/2009-07/1051.html

> > i'm running 4.6-STABLE, SMP kernel. dmesg attached:
> >
> > OpenBSD 4.6 (build) #5: Thu Nov 12 10:28:47 EST 2009
> > j...@cosmicnetworks.net:/usr/local/build
>
> Ahem.  You're running a non-standard kernel and yet don't call that
> out or describe what the difference is between it and the standard
> GENERIC.MP config?  Ah well, I guess I've typed this far, so I'll
> send this response instead of deleting it...

sorry for not stating. i run an unmodified GENERIC.MP, i just build
outside of /usr/src when i apply patches that require a kernel rebuild.

shan't happen again, sir.



Re: uvm_fault, possibly fstat(1) related

2009-12-27 Thread Brynet
> i'm running 4.6-STABLE, SMP kernel. dmesg attached:
>
> OpenBSD 4.6 (build) #5: *Thu Nov 12 10:28:47 EST 2009*

Update your tree, rebuild the kernel.

There were a few cases where fstat(1) would crash my systems as well.

http://marc.info/?l=openbsd-cvs&m=125971073002018&w=2
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/kern/kern_sysctl.c#rev1.175.4.1

It appears to fix things, and was committed on December 1st.

-Bryan.

--- src/sys/kern/kern_sysctl.c  2009/06/15 18:59:45 1.175
+++ src/sys/kern/kern_sysctl.c  2009/12/01 23:37:51 1.175.4.1
@@ -1,4 +1,4 @@
-/* $OpenBSD: kern_sysctl.c,v 1.175 2009/06/15 17:59:45 deraadt Exp $   */
+/* $OpenBSD: kern_sysctl.c,v 1.175.4.1 2009/12/01 23:37:51 william Exp $   */
 /* $NetBSD: kern_sysctl.c,v 1.17 1996/05/20 17:49:05 mrg Exp $ */

 /*-
@@ -1086,6 +1086,8 @@ fill_file2(struct kinfo_file2 *kf, struct file *fp, st
 		kf->so_pcb = PTRTOINT64(so->so_pcb);
 		kf->so_protocol = so->so_proto->pr_protocol;
 		kf->so_family = so->so_proto->pr_domain->dom_family;
+		if (!so->so_pcb)
+			break;
 		switch (kf->so_family) {
 		case AF_INET: {
 			struct inpcb *inpcb = so->so_pcb;