system crash tickled by 450.status-security (fwd)

2004-01-28 Thread John Uhlig

We are running FreeBSD 4.9 on two Dell PowerEdge 2650s as fileservers,
each with 1 TB of RAID disk space. Both crash and reboot every few
days at approximately 3:15 AM. The systems appear to be running the
/etc/periodic/daily/450.status-security script when the crash occurs.
Running the daily cron jobs more frequently induces the crash more often.
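
For anyone trying to narrow this down without waiting for cron: the backtrace later in this message shows sed issuing a single 16384-byte write() to a pipe when the fault hit. The following is a minimal Python sketch that pushes a write of that size through a pipe while a reader drains it. The function name pipe_roundtrip and the 4096-byte read size are illustrative choices, not anything from the original report, and nothing here is known to reproduce the panic; it only exercises a pipe write of the same size from user space.

```python
import os
import threading


def pipe_roundtrip(nbytes: int = 16384) -> int:
    """Write `nbytes` into a pipe and return how many bytes a concurrent
    reader drained from the other end."""
    read_fd, write_fd = os.pipe()
    received = []

    def drain() -> None:
        total = 0
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:          # EOF: the writer closed its end
                break
            total += len(chunk)
        received.append(total)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        payload = b"x" * nbytes
        sent = 0
        while sent < nbytes:       # os.write may accept fewer bytes than asked
            sent += os.write(write_fd, payload[sent:])
    finally:
        os.close(write_fd)         # signals EOF to the reader
        reader.join()
        os.close(read_fd)
    return received[0]


if __name__ == "__main__":
    # Repeat the same-sized write many times; the count is arbitrary.
    for _ in range(1000):
        assert pipe_roundtrip(16384) == 16384
```

Running this in a loop on a test machine (not a production fileserver) might show whether large pipe writes alone are enough to trigger the fault, or whether something else in the periodic scripts is involved.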

We have a kernel core dump and have included some of the gdb output
below. I would appreciate any pointers or suggestions that can help
us resolve this problem.

thanks,
John Uhlig

===
uname output

platoon# uname -a
FreeBSD platoon.parc.xerox.com 4.9-RELEASE-p1 FreeBSD 4.9-RELEASE-p1 #0: Wed Jan 28 08:45:33 PST 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PARCGBNIC.debg  i386

=
initial gdb output
==
SMP 4 cpus
IdlePTD at phsyical address 0x0051f000
initial pcb at physical address 0x0044e560
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 0002; cpuid = 0; lapic.id = 
fault virtual address   = 0xbfc00000
fault code  = supervisor write, page not present
instruction pointer = 0x8:0xc0356149
stack pointer   = 0x10:0xffbe1e04
frame pointer   = 0x10:0xffbe1e10
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 701 (sed)
interrupt mask  = none - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0002; cpuid = 0; lapic.id = 
boot() called on cpu#0

syncing disks... 52 
done
Uptime: 20m43s
amr0: flushing cache...done

dumping to dev #aacd/0x40001, offset 5243136

===
List code at instruction pointer address

(kgdb) list *0xc0356149
0xc0356149 is in pmap_qenter (/usr/src/sys/i386/i386/pmap.c:848).
843 void
844 pmap_qenter(vm_offset_t va, vm_page_t *m, int count)
845 {
846         while (count-- > 0) {
847                 pt_entry_t *pte = vtopte(va);
848                 *pte = VM_PAGE_TO_PHYS(*m) | PG_RW | PG_V | pgeflag;
849 #ifdef SMP
850                 cpu_invlpg((void *)va);
851 #else
852                 invltlb_1pg(va);
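
The faulting statement at line 848 stores into the page-table entry returned by vtopte(va). The backtrace further down reports pmap_qenter being entered with va=0 and a fault address of eva=3217031168. The sketch below shows arithmetic that would connect the two. It assumes the usual i386 recursive page-table mapping with 4 KB pages and 4-byte entries (no PAE). The PTMAP_BASE constant is inferred from this dump, not read from the 4.9 headers, and pte_address is a hypothetical helper name.

```python
# Assumed i386 constants for this sketch. PTMAP_BASE is inferred from the
# dump (fault address observed with va=0), not taken from the 4.9 headers.
PAGE_SHIFT = 12          # 4 KB pages
PTE_SIZE = 4             # bytes per page-table entry, assuming no PAE
PTMAP_BASE = 0xBFC00000  # assumption: base of the recursive page-table map


def pte_address(va: int) -> int:
    """Return the kernel virtual address of the PTE that would map `va`."""
    return PTMAP_BASE + (va >> PAGE_SHIFT) * PTE_SIZE


if __name__ == "__main__":
    # With va=0 the PTE address is the base of the map itself.
    print(hex(pte_address(0)))
    # The eva value from the backtrace, for comparison.
    print(hex(3217031168))
```

Under these assumptions both values print as 0xbfc00000. That would be consistent with the "supervisor write, page not present" fault being the write to the PTE for virtual address 0, which suggests the real question is why pipe_build_write_buffer passed va=0.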

=
backtrace
=
(kgdb) backtrace
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc01d85c3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc01d8a1c in poweroff_wait (junk=0xc03d3819, howto=-1069731121)
    at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc035a4d8 in trap_fatal (frame=0xffbe1dc4, eva=3217031168)
    at /usr/src/sys/i386/i386/trap.c:974
#4  0xc035a169 in trap_pfault (frame=0xffbe1dc4, usermode=0, eva=3217031168)
    at /usr/src/sys/i386/i386/trap.c:867
#5  0xc0359cdb in trap (frame={tf_fs = 24, tf_es = -67108848, tf_ds = 134545424,
      tf_edi = -67978584, tf_esi = 0, tf_ebp = -4317680, tf_isp = -4317712, tf_ebx = 3,
      tf_edx = -1043397044, tf_ecx = 0, tf_eax = 1122230275, tf_trapno = 12, tf_err = 2,
      tf_eip = -1070243511, tf_cs = 8, tf_eflags = 66054, tf_esp = 134606848,
      tf_ss = 134606848}) at /usr/src/sys/i386/i386/trap.c:466
#6  0xc0356149 in pmap_qenter (va=0, m=0xfbf2baa8, count=4)
    at /usr/src/sys/i386/i386/pmap.c:848
#7  0xc01e91fe in pipe_build_write_buffer (wpipe=0xfbf2ba80, uio=0xffbe1ed0)
    at /usr/src/sys/kern/sys_pipe.c:594
#8  0xc01e93c4 in pipe_direct_write (wpipe=0xfbf2ba80, uio=0xffbe1ed0)
    at /usr/src/sys/kern/sys_pipe.c:709
#9  0xc01e9766 in pipe_write (fp=0xcb801000, uio=0xffbe1ed0, cred=0xc875cc00, flags=0,
    p=0xfc001080) at /usr/src/sys/kern/sys_pipe.c:827
#10 0xc01e7ae9 in dofilewrite (p=0xfc001080, fp=0xcb801000, fd=1, buf=0x805b000,
    nbyte=16384, offset=-1, flags=0) at /usr/src/sys/sys/file.h:163
#11 0xc01e79a2 in write (p=0xfc001080, uap=0xffbe1f80)
    at /usr/src/sys/kern/sys_generic.c:329
#12 0xc035a809 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134590464,
      tf_esi = 672071960, tf_ebp = -1077937248, tf_isp = -4317228, tf_ebx = 672072428,
      tf_edx = 672071960, tf_ecx = 0, tf_eax = 4, tf_trapno = 7, tf_err = 2,
      tf_eip = 672025636, tf_cs = 31, tf_eflags = 663, tf_esp = -1077937292, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:1175
#13 0xc034517b in Xint0x80_syscall ()
#14 0x280e2902 in ?? ()
#15 0x280e2871 in ?? ()
#16 0x280df756 in ?? ()
#17 0x28088fb5 in ?? ()
#18 0x804b81f in ?? ()
#19 0x804a926 in ?? ()
#20 0x8048f96 in ?? ()
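
kgdb prints the trap frame fields and eva as decimal integers, some of them negative, which makes them hard to match against the hexadecimal addresses in the panic message. A small Python sketch for converting them to unsigned 32-bit hex follows. The helper names to_u32 and as_hex are illustrative only.

```python
def to_u32(value: int) -> int:
    """Reinterpret a (possibly negative) integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def as_hex(value: int) -> str:
    """Format a register value the way the panic message prints addresses."""
    return f"0x{to_u32(value):08x}"


if __name__ == "__main__":
    # Values copied from frames #3 and #5 of the backtrace above.
    print("tf_eip =", as_hex(-1070243511))  # instruction pointer at the fault
    print("tf_ebp =", as_hex(-4317680))     # frame pointer at the fault
    print("eva    =", as_hex(3217031168))   # faulting virtual address
```

The converted tf_eip is 0xc0356149 and tf_ebp is 0xffbe1e10, matching the instruction pointer and frame pointer in the panic message. The converted eva is 0xbfc00000.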


Re: system crash tickled by 450.status-security (fwd)

2004-01-28 Thread Jorn Argelo
Well, I recall FreeBSD 5.1 having problems with the RAID controller
used in the PE 2650 (a Dell PERC 3/Di or something like that, wasn't
it?). I don't know how it behaves with 4.9, though; I never tried that.

We were running Nagios and MRTG, which are monitoring tools, on that
box. It had to handle about 5 or 6 SNMP checks plus several port
checks from about 175 servers, so it was under quite a load, and it
frequently crashed completely.

Unfortunately I can't give you a real solution. The funny thing was,
I tried upgrading it to FreeBSD 5.1-CURRENT, but that wasn't working
at all. So I reinstalled RELEASE, recompiled the kernel with the same
configuration file I had used before, and suddenly it was all fine.
It now has an uptime of 31 days.

I know this message isn't going to help you much, but I thought it
might be handy to know that you are not the only one having problems
with the Dell PowerEdge 2650.

Cheers,

Jorn
