At 09:14 AM 5/5/2006, you wrote:
Hello,

We have a group of web and mail servers that run under a moderate load. We recently upgraded them from 4.x/5.x to 6.0. We thought we had done enough testing, but apparently we hadn't, and we are now seeing panics on a number of the servers. Some of our more heavily loaded servers have been fine for days, while others crash every 6 to 36 hours. Below are some pieces of information that may be helpful.

Should I be posting this to another list as well?

I know I can decrease NMBCLUSTERS dramatically, and give more memory to the kernel if that would help.

I've read a number of similar cases where this panic was related to a hardware failure, and while I can't rule that out completely, it does seem unusual that several servers are apparently having the same problem. Could it be that hardware problems existed before the upgrade, but are now brought out by the increased load caused by the new OS version and other installed software? We have IPMI cards in some of the crashing servers and they all report normal temperatures, fan speeds, and voltages. Nothing unusual in the event logs.

I'm willing to dig deeper and do more testing if anyone has suggestions.

As suggested, we have upgraded to 6.1-RC2 and are experiencing the same or a very similar panic. Some things have changed on the system, so I'll re-post our config, backtrace, and dmesg.

Also, in the past I have seen kgdb report the process that caused the panic, along with some other information, when it first loads the dump, but sometimes it doesn't show that information. Am I doing something wrong there? Whenever it has shown it, the process has always been tcpserver from the ucspi-tcp-0.88_2 port.


Differences from the 6.1-RC2 GENERIC kernel config
-----------------------------------
#cpu            I486_CPU
cpu             I586_CPU
cpu             I686_CPU
ident           MAIL_6_1

options         SUIDDIR
options         QUOTA
options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_VERBOSE_LIMIT=10
options         NMBCLUSTERS=65536
options         KVA_PAGES="640"
options         VM_KMEM_SIZE_MAX=(512*1048576)
options         VM_KMEM_SIZE_SCALE=2

options         ASR_COMPAT

options         SHMMAXPGS=131072
options         SEMMNI=128
options         SEMMNS=512
options         SEMUME=100
options         SEMMNU=256
-----------------------------------
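
For anyone weighing the NMBCLUSTERS and KVA_PAGES settings above, here is a rough back-of-the-envelope sketch in Python. It is not from the original system. It assumes the standard 2048-byte mbuf cluster size (MCLBYTES) and that, on non-PAE i386, one KVA_PAGES unit covers 4 MB of kernel address space. Whether all cluster memory is charged against kmem is not something this sketch checks.

```python
# Rough sizing arithmetic for the kernel options listed above.
MIB = 1024 * 1024

NMBCLUSTERS = 65536               # from the config above
MCLBYTES = 2048                   # assumption: standard mbuf cluster size
KVA_PAGES = 640                   # from the config above
KVA_PAGE_BYTES = 4 * MIB          # assumption: non-PAE i386, 4 MB per unit
VM_KMEM_SIZE_MAX = 512 * 1048576  # from the config above

# Memory consumed if every mbuf cluster were allocated at once.
cluster_bytes = NMBCLUSTERS * MCLBYTES

# Kernel virtual address space and the resulting kernel base address
# on a 32-bit (4 GB) address space.
kva_bytes = KVA_PAGES * KVA_PAGE_BYTES
kernbase = (1 << 32) - kva_bytes

print(f"mbuf clusters at the limit: {cluster_bytes // MIB} MB")
print(f"share of VM_KMEM_SIZE_MAX:  {100 * cluster_bytes // VM_KMEM_SIZE_MAX}%")
print(f"kernel address space:       {kva_bytes // MIB} MB")
print(f"kernel base address:        {kernbase:#x}")
```

Under those assumptions the cluster limit works out to 128 MB, a quarter of VM_KMEM_SIZE_MAX, and the kernel base lands at 0x60000000, which agrees with the 0x606... and 0x608... code addresses in the backtrace below.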

-----------------------------------
mail-da-5# kgdb /boot/kernel/kernel.debug vmcore.8
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:


#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0x6064e239 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0x6064e4d0 in panic (fmt=0x60894857 "%s") at /usr/src/sys/kern/kern_shutdown.c:558
#3  0x608496d4 in trap_fatal (frame=0x9c8f7ad8, eva=172) at /usr/src/sys/i386/i386/trap.c:836
#4  0x6084943b in trap_pfault (frame=0x9c8f7ad8, usermode=0, eva=172) at /usr/src/sys/i386/i386/trap.c:744
#5  0x60849079 in trap (frame=
{tf_fs = 1619263496, tf_es = 1627652136, tf_ds = 40, tf_edi = 55, tf_esi = 0, tf_ebp = -1668318412, tf_isp = -1668318460, tf_ebx = -1668318064, tf_edx = 1738397568, tf_ecx = 0, tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = 1617891744, tf_cs = 32, tf_eflags = 66182, tf_esp = 1835631104, tf_ss = 0})
    at /usr/src/sys/i386/i386/trap.c:434
#6  0x6083890a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0x606f11a0 in ip_ctloutput (so=0x4, sopt=0x9c8f7c90) at atomic.h:146
#8  0x6070123b in tcp_ctloutput (so=0x66b34858, sopt=0x9c8f7c90) at /usr/src/sys/netinet/tcp_usrreq.c:1038
#9  0x60687c04 in sosetopt (so=0x66b34858, sopt=0x9c8f7c90) at /usr/src/sys/kern/uipc_socket.c:1560
#10 0x6068ce95 in kern_setsockopt (td=0x679dd780, s=0, level=4, name=4, val=0x679dd780, valseg=UIO_USERSPACE,
    valsize=0) at /usr/src/sys/kern/uipc_syscalls.c:1351
#11 0x6068cdc6 in setsockopt (td=0x679dd780, uap=0x4) at /usr/src/sys/kern/uipc_syscalls.c:1307
#12 0x608499eb in syscall (frame=
{tf_fs = 1606352955, tf_es = 59, tf_ds = 1606352955, tf_edi = 1606413432, tf_esi = 3, tf_ebp = 1606413224, tf_isp = -1668317852, tf_ebx = 0, tf_edx = 2, tf_ecx = 134545464, tf_eax = 105, tf_trapno = 12, tf_err = 2, tf_eip = 672065711, tf_cs = 51, tf_eflags = 514, tf_esp = 1606413180, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:981
#13 0x6083895f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#14 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) q
-----------------------------------
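
The trap frame in frame #5 is printed in decimal, which makes it hard to compare with the hex addresses in the rest of the backtrace. The following is a minimal sketch in Python, not part of the kgdb session, that converts a few of the fields. The helper names are made up for illustration, and the meaning assigned to the tf_err bits assumes the standard x86 page-fault error code layout (bit 0: protection violation rather than not-present page, bit 1: write, bit 2: user mode).

```python
# Decode selected fields of the frame #5 trap frame shown above.
frame = {"tf_trapno": 12, "tf_err": 2, "tf_eip": 1617891744}
eva = 172  # faulting virtual address reported by trap_pfault/trap_fatal


def u32(value):
    """Interpret a possibly negative decimal as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def decode_pf_err(err):
    """Split an x86 page-fault error code into its three low bits."""
    return {
        "protection_violation": bool(err & 0x1),  # False: page not present
        "write": bool(err & 0x2),                 # True: fault on a write
        "user_mode": bool(err & 0x4),             # False: fault in the kernel
    }


print(f"tf_eip = {u32(frame['tf_eip']):#010x}")  # 0x606f11a0
print(f"eva    = {eva:#x}")                      # 0xac
print(decode_pf_err(frame["tf_err"]))
```

Read this way, tf_eip is 0x606f11a0, the same address kgdb shows for ip_ctloutput in frame #7, and the fault is a kernel-mode write to a not-present page at address 0xac. A fault that close to zero is consistent with a structure field being written through a NULL pointer, although this alone does not say which pointer.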


-----------------------------------
mail-da-5# dmesg
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RC2 #0: Fri May  5 07:27:28 MDT 2006
    [EMAIL PROTECTED]:/usr/obj/usr/src/sys/LOCAL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2399.33-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf27  Stepping = 7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNTX-ID,<b14>>
  Logical CPUs per core: 2
real memory  = 2146959360 (2047 MB)
avail memory = 2093965312 (1996 MB)
MPTable: <  Kings Canyon>
ioapic0: Assuming intbase of 0
ioapic1: Assuming intbase of 24
ioapic2: Assuming intbase of 48
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
kbd1 at kbdmux0
rr232x: RocketRAID 232x controller driver v1.02 (May  5 2006 07:26:59)
cpu0 on motherboard
pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
pci0: <unknown> at device 0.1 (no driver attached)
pcib1: <PCI-PCI bridge> at device 2.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <base peripheral, interrupt controller> at device 28.0 (no driver attached)
pcib2: <MPTable PCI-PCI bridge> at device 29.0 on pci1
pci2: <PCI bus> on pcib2
em0: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port 0x3000-0x303f mem 0xf8200000-0xf821ffff irq 54 at device 3.0 on pci2
em0: Ethernet address: 00:30:48:28:78:fe
em1: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port 0x3040-0x307f mem 0xf8220000-0xf823ffff irq 55 at device 3.1 on pci2
em1: Ethernet address: 00:30:48:28:78:ff
pci1: <base peripheral, interrupt controller> at device 30.0 (no driver attached)
pcib3: <MPTable PCI-PCI bridge> at device 31.0 on pci1
pci3: <PCI bus> on pcib3
asr0: <Adaptec Caching SCSI RAID> mem 0xf8300000-0xf83fffff,0xfb000000-0xfbffffff,0xfc000000-0xfdffffff irq 30 at device 3.0 on pci3
asr0: [GIANT-LOCKED]
asr0: ADAPTEC 2015S FW Rev. 3B05, 2 channel, 256 CCBs, Protocol I2O
uhci0: <Intel 82801CA/CAM (ICH3) USB controller USB-A> port 0x2000-0x201f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82801CA/CAM (ICH3) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801CA/CAM (ICH3) USB controller USB-B> port 0x2020-0x203f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <Intel 82801CA/CAM (ICH3) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801CA/CAM (ICH3) USB controller USB-C> port 0x2040-0x205f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <Intel 82801CA/CAM (ICH3) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pcib4: <MPTable PCI-PCI bridge> at device 30.0 on pci0
pci4: <PCI bus> on pcib4
pci4: <display, VGA> at device 1.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH3 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x2060-0x206f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xcefff,0xe0000-0xe3fff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0c02> can't assign resources (memory)
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c02> can't assign resources (memory)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
Timecounter "TSC" frequency 2399330476 Hz quality 800
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging limited to 10 packets/entry by default
rr232x: no controller detected.
acd0: CDROM <MATSHITA CR-177/7T0D> at ata1-master UDMA33
ses0 at asr0 bus 0 target 6 lun 0
ses0: <SUPER GEM318 0> Fixed Processor SCSI-2 device
ses0: SAF-TE Compliant Device
da0 at asr0 bus 0 target 0 lun 0
da0: <SEAGATE ST336706LC 4101> Fixed Direct Access SCSI-4 device
da0: Tagged Queueing Enabled
da0: 34687MB (71041007 512 byte sectors: 255H 63S/T 4422C)
da1 at asr0 bus 0 target 1 lun 0
da1: <ADAPTEC RAID-1 3B05> Fixed Direct Access SCSI-2 device
da1: Tagged Queueing Enabled
da1: 140014MB (286748672 512 byte sectors: 255H 63S/T 17849C)
Trying to mount root from ufs:/dev/da0s1a
-----------------------------------

Any help will be greatly appreciated.

Nick Wood
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
