On 07/08/10 02:34, Richard Toohey wrote:
On 8/07/2010, at 2:45 PM, Daniel Melameth wrote:

On Fri, Jun 18, 2010 at 11:08 PM, Daniel Melameth <dan...@melameth.com> wrote:
On my firewall at home, on occasion, running systat queues leaves me with
an unresponsive system.  Pings are not returned and the keyboard at the
console is unresponsive.  Sometimes the command works fine and sometimes it
does not--though it does seem the issue is more likely to occur when the
system has an uptime of more than a week or two.  I'm uncertain how to
troubleshoot this further and I have been unable to reproduce the issue on
other 4.7-stable systems (though these other systems are not running the
same hardware and software).
I upgraded the system several days ago to a snapshot from just before
the hackathon, and the system appeared more stable, but I can now also
instantly kill the box by running netstat -m after about five days of
uptime.

Ideas appreciated...

Hardware?

Tried different NICs?  RAM?  Put the HD in another machine?

No-one else seems to be seeing this (or reporting it) and you can't
reproduce on other machines, so worth eliminating hardware.

Anything unusual or different about this machine or what you run on it?


I said much the same thing to Daniel off-list when he first posted almost two weeks ago, suggesting he try a new snapshot (at the time) and then another after the hackathon.

Interestingly, since then I've installed the June 23rd snapshot (and built -current on June 27th) and guess what? Sporadic freezes under different circumstances, none of which match Daniel's (netstat -m seems to work fine for me). When I say freeze, I mean locked up hard: no mouse, no keyboard, no pings, nothing; I have to power cycle it.

Two freezes occurred when I wasn't using the system locally, just watching movies (on another PC) over Samba. The third occurred while I was reading my mail locally (like now), with an ssh network backup running from /etc/daily.local.

I'll be trying a newer snap this weekend (or before) and see how things go. This is using the same hardware and same setup that has been fine for almost two years (except a new HDD from Nov/09), so I seriously doubt it's hardware. Three "random" freezes in a week and a half when it's never happened on this hardware before, ever. My previous install was running -current from early(?) May.

Sorry for the completely vague message; I know it won't help anyone debug anything. The problem can't be reproduced, but I'm guessing some networking changes have happened that are affecting Daniel and me.

I'm only posting this in case there are other lurkers that this is happening to, who haven't mentioned anything because there just aren't any leads to go on.

So, anyone else having mysterious intermittent lockups when the network is in use?

Dmesg & processes: (the unmounted warning is from the last time it froze up, 27h ago)

OpenBSD 4.7-current (GENERIC.MP) #0: Sun Jun 27 01:54:59 EDT 2010
    r...@blackstaff.erratic.ca:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Pentium(R) 4 CPU 3.20GHz ("GenuineIntel" 686-class) 3.20 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,EST,CNXT-ID,CX16,xTPR
real mem  = 1061974016 (1012MB)
avail mem = 1035464704 (987MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 02/14/06, BIOS32 rev. 0 @ 0xfd61a, SMBIOS rev. 2.34 @ 0xf0320 (59 entries)
bios0: vendor IBM version "2EKT33AUS" date 02/14/2006
bios0: IBM 8215W97
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP TCPA MCFG APIC BOOT ASF! SSDT
acpi0: wakeup devices AZAL(S3) EXP0(S5) EXP1(S5) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USBE(S3) SLOT(S5) KBC_(S3) PSM_(S3) COMA(S5) COMB(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 199MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Pentium(R) 4 CPU 3.20GHz ("GenuineIntel" 686-class) 3.20 GHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,EST,CNXT-ID,CX16,xTPR
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 2 (EXP0)
acpiprt3 at acpi0: bus 4 (EXP1)
acpiprt4 at acpi0: bus 10 (SLOT)
acpicpu0 at acpi0: FVS, 1600, 1400 MHz
acpicpu1 at acpi0: FVS, 1600, 1400 MHz
acpitz0 at acpi0: critical temperature 255 degC
acpibtn0 at acpi0: PWRB
bios0: ROM list: 0xc0000/0xac00! 0xcb000/0x1000 0xcc000/0x1000 0xcd000/0x800 0xe0000/0x1800!
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82945G Host" rev 0x02
vga1 at pci0 dev 2 function 0 "Intel 82945G Video" rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xc0000000, size 0x10000000
inteldrm0 at vga1: apic 2 int 16 (irq 5)
drm0 at inteldrm0
"Intel 82945G Video" rev 0x02 at pci0 dev 2 function 1 not configured
azalia0 at pci0 dev 27 function 0 "Intel 82801GB HD Audio" rev 0x01: apic 2 int 22 (irq 9)
azalia0: codecs: Analog Devices AD1981HD
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01: apic 2 int 17 (irq 11)
pci1 at ppb0 bus 2
em0 at pci1 dev 0 function 0 "Intel PRO/1000MT (82573E)" rev 0x03: apic 2 int 16 (irq 5), address 00:10:c6:b5:c1:72
ppb1 at pci0 dev 28 function 1 "Intel 82801GB PCIE" rev 0x01: apic 2 int 16 (irq 5)
pci2 at ppb1 bus 4
uhci0 at pci0 dev 29 function 0 "Intel 82801GB USB" rev 0x01: apic 2 int 23 (irq 11)
uhci1 at pci0 dev 29 function 1 "Intel 82801GB USB" rev 0x01: apic 2 int 19 (irq 9)
uhci2 at pci0 dev 29 function 2 "Intel 82801GB USB" rev 0x01: apic 2 int 18 (irq 10)
uhci3 at pci0 dev 29 function 3 "Intel 82801GB USB" rev 0x01: apic 2 int 16 (irq 5)
ehci0 at pci0 dev 29 function 7 "Intel 82801GB USB" rev 0x01: apic 2 int 23 (irq 11)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb2 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xe1
pci3 at ppb2 bus 10
xl0 at pci3 dev 9 function 0 "3Com 3c905C 100Base-TX" rev 0x74: apic 2 int 21 (irq 7), address 00:01:02:3b:a6:d1
bmtphy0 at xl0 phy 24: 3C905C internal PHY, rev. 6
ichpcib0 at pci0 dev 31 function 0 "Intel 82801GB LPC" rev 0x01: PM disabled
pciide0 at pci0 dev 31 function 2 "Intel 82801GB SATA" rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <ST31500341AS>
wd0: 16-sector PIO, LBA48, 1430799MB, 2930277168 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 6
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <HL-DT-ST, DVDRAM GSA-4163B, AX13> ATAPI 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 4
ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x01: apic 2 int 19 (irq 9)
iic0 at ichiic0
adt0 at iic0 addr 0x2e: lm96000 rev 0x68
spdmem0 at iic0 addr 0x50: 512MB DDR2 SDRAM non-parity PC2-4200CL5
spdmem1 at iic0 addr 0x52: 512MB DDR2 SDRAM non-parity PC2-5300CL5
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb4 at uhci3: USB revision 1.0
uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at ichpcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
mtrr: Pentium Pro MTRR support
umass0 at uhub0 port 6 configuration 1 interface 0 " USB Flash Memory" rev 2.00/1.10 addr 2
umass0: using SCSI over Bulk-Only
scsibus1 at umass0: 2 targets, initiator 0
sd0 at scsibus1 targ 1 lun 0: <, USB Flash Memory, PMAP> SCSI0 0/direct removable
sd0: 15306MB, 512 bytes/sec, 31346688 sec total
uhub5 at uhub4 port 1 "Dell Dell USB Keyboard Hub" rev 1.10/2.00 addr 2
uhidev0 at uhub5 port 1 configuration 1 interface 0 "Dell Dell USB Keyboard" rev 1.10/2.00 addr 3
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub5 port 1 configuration 1 interface 1 "Dell Dell USB Keyboard" rev 1.10/2.00 addr 3
uhidev1: iclass 3/0, 3 report ids
uhid0 at uhidev1 reportid 1: input=1, output=0, feature=0
uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
uhid2 at uhidev1 reportid 3: input=3, output=0, feature=0
uhidev2 at uhub4 port 2 configuration 1 interface 0 "Logitech USB Optical Mouse" rev 2.00/43.01 addr 4
uhidev2: iclass 3/1
ums0 at uhidev2: 3 buttons, Z dir
wsmouse0 at ums0 mux 0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
root on wd0a swap on wd0b dump on wd0b
WARNING: / was not properly unmounted


$ ps -ax
  PID TT  STAT       TIME COMMAND
    1 ??  Is      0:00.01 /sbin/init
22782 ??  Is      0:00.00 syslogd: [priv] (syslogd)
17273 ??  I       0:00.16 syslogd -a /var/spool/postfix/dev/log -a /var/web/www0/dev/log -a /var/web/ww
14610 ??  Is      0:00.00 pflogd: [priv] (pflogd)
 2582 ??  S       0:04.10 pflogd: [running] -s 116 -i pflog0 -f /var/log/pflog (pflogd)
26996 ??  Is      0:00.11 ntpd: ntp engine (ntpd)
 6476 ??  Is      0:00.00 ntpd: dns engine (ntpd)
 7663 ??  Is      0:00.01 ntpd: [priv] (ntpd)
28240 ??  Is      0:00.01 /usr/sbin/sshd
 6442 ??  Is      0:00.01 /usr/libexec/ftpd -D4llUSA
23944 ??  Is      0:00.01 /sbin/sshd
 5546 ??  Is      0:00.03 /usr/local/libexec/smbd
28105 ??  Ss      0:00.99 /usr/local/libexec/nmbd
 3760 ??  Ss      0:00.44 /usr/local/libexec/postfix/master
27995 ??  I       0:00.08 qmgr -l -t fifo -u -c
28216 ??  I       0:00.00 /usr/local/libexec/smbd
24650 ??  Ss      0:14.06 postgres: writer process    (postgres)
 9166 ??  Ss      0:09.04 postgres: wal writer process    (postgres)
12503 ??  Ss      0:03.53 postgres: autovacuum launcher process (postgres)
24201 ??  Ss      0:02.98 postgres: stats collector process    (postgres)
24841 ??  Ss      0:06.02 httpd4: parent [chroot /var/web/www4] (httpd4)
30352 ??  Ss      0:00.12 cron
27201 ??  Is      0:00.01 /usr/X11R6/bin/xdm
24112 ??  Ss     13:58.04 /usr/X11R6/bin/X :0 vt05 -auth /etc/X11/xdm/authdir/authfiles/A:0-jXPDMx (Xor
 5106 ??  I       0:00.03 X: [priv] (Xorg)
15015 ??  Is      0:00.31 xdm: :0 (xdm)
30739 ??  I       0:00.00 /bin/sh -c /bin/sh
28938 ??  I       0:00.01 /usr/local/sbin/cronolog /var/web/www4/logs/error_log_kemcha-%Y%m%d
15243 ??  I       0:00.00 /bin/sh -c /bin/sh
 1561 ??  I       0:00.19 /usr/local/sbin/cronolog /var/web/www4/logs/access_log_kemcha-%Y%m%d
22160 ??  I       0:07.51 httpd4: child (httpd4)
 6064 ??  I       0:08.14 httpd4: child (httpd4)
27642 ??  I       0:06.26 httpd4: child (httpd4)
16014 ??  I       0:06.90 httpd4: child (httpd4)
26422 ??  Is      0:00.01 /bin/sh /etc/X11/xdm/Xsession
15112 ??  Is      0:00.25 ssh-agent -s
27060 ??  I       0:00.26 xterm
 1001 ??  S       0:16.77 scrotwm
 6492 ??  S       0:02.61 xterm
31285 ??  S       3:24.82 /bin/sh /usr/local/share/scrotwm/baraction.sh
25777 ??  Is      0:00.01 /bin/sh /usr/local/bin/firefox /usr/local/bin/firefox -no-remote
31051 ??  I       0:00.01 /bin/sh /usr/local/mozilla-firefox/run-mozilla.sh /usr/local/mozilla-firefox/
 3855 ??  S     118:32.08 /usr/local/mozilla-firefox/firefox-bin
16784 ??  Is      0:00.01 /bin/sh /usr/local/bin/thunderbird
 4964 ??  I       0:00.01 /bin/sh /usr/local/mozilla-thunderbird/run-mozilla.sh /usr/local/mozilla-thun
18169 ??  S       5:04.88 /usr/local/mozilla-thunderbird/thunderbird-bin
23832 ??  Is      0:00.06 sshd: scott [priv] (sshd)
26546 ??  I       0:00.02 sshd: sc...@ttyp2 (sshd)
 7198 ??  Is      0:00.24 xterm &
24631 ??  Ss      0:00.98 xterm &
 2879 ??  Is      0:00.09 xterm &
 2594 ??  Is      0:00.01 postgres: kemcha kemcha [local] idle (postgres)
22775 ??  S       0:00.05 pickup -l -t fifo -u -c
25067 ??  S       0:00.40 /usr/sbin/iostat -C -c 3600
26334 p0  Is      0:00.01 -ksh (ksh)
23597 p0  I+      0:00.02 ksh
20836 p1  Is      0:00.03 -ksh (ksh)
 2434 p1  I+      0:00.10 vi functions.php
17039 p2  Is      0:00.01 -ksh (ksh)
13584 p2  I+      0:00.00 tail -f /var/web/www4/logs/error_log_kemcha-20100708
25340 p3  Is      0:00.02 -ksh (ksh)
17805 p3  I+      0:00.04 psql
26679 p4  Ss      0:00.07 -ksh (ksh)
 3585 p4  R+/0    0:00.00 ps -ax
10219 p5  Is      0:00.01 -ksh (ksh)
31488 p5  I+      0:00.03 vi TODO
31768 C0- I       0:01.94 /usr/local/bin/postgres -D /var/postgresql/data -D /var/postgresql/data
26593 C0- I       0:00.00 /bin/sh /command/svscanboot
13614 C0- S       0:02.05 svscan /service
29736 C0- I       0:00.00 readproctitle service errors: ...............................................
19545 C0- I       0:00.01 supervise tinydns
20036 C0- I       0:00.01 supervise log
 1134 C0- I       0:00.01 supervise dnscache
13632 C0- I       0:00.01 supervise log
25581 C0- I       0:00.02 multilog t /var/log/tinydns
30329 C0- I       0:03.95 /usr/local/bin/dnscache
15770 C0- I       0:00.05 /usr/local/bin/tinydns
 1893 C0- I       0:02.01 multilog t /var/log/dnscache
27791 C0- I       0:00.00 /usr/local/sbin/cronolog /var/web/www4/logs/error_log_kemcha-%Y%m%d
10858 C0  Is+     0:00.00 /usr/libexec/getty std.9600 ttyC0

--
- RSM
www.erratic.ca
