Re: FreeBSD -STABLE servers repeatedly crashing.
Gary Mulder wrote on 2005-07-18 23:39:

> From personal experience I can repeat what Matt has stated. It seems to be related to what NIC you have. I have had crashes with fxp (Intel Pro 100MBit) and bge (Broadcom Gigabit) NICs under moderate network load.

It seems not. I had crashes with fxp, xl, bge and em, IIRC.

> Removing ipf reduced but did not eliminate the crashes.

Removing IPF eliminated crashes in every case on my SMP boxes.

-- 
* Maciej Wierzbicki * At paranoia's poison door *
* VOO1-RIPE VOO1-6BONE *
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, Jul 20, 2005 at 03:57:57PM +0200, Maciej Wierzbicki wrote:
> Gary Mulder wrote on 2005-07-18 23:39:
>
> > From personal experience I can repeat what Matt has stated. It seems to be related to what NIC you have. I have had crashes with fxp (Intel Pro 100MBit) and bge (Broadcom Gigabit) NICs under moderate network load.
>
> It seems not. I had crashes with fxp, xl, bge and em, IIRC.
>
> > Removing ipf reduced but did not eliminate the crashes.
>
> Removing IPF eliminated crashes in every case on my SMP boxes.

Folks, you're talking about different things:

* Panics with IPF enabled on SMP with any network card (the network card is not relevant to this problem; IPF is). This problem is understood, and the only current solution is 'don't use both SMP and IPF'.

* What Gary is talking about, which are apparently panics without IPF enabled on several NICs. Since this is a new problem, Gary needs to do some additional diagnosis work so that someone can investigate it.

Let's try to keep the issue clear :-)

Kris
Re: FreeBSD -STABLE servers repeatedly crashing.
On Mon, Jul 18, 2005 at 05:39:40PM -0400, Gary Mulder wrote:
> On Mon, 18 Jul 2005, Pawel Malachowski wrote:
> > On Mon, Jul 18, 2005 at 04:09:58PM -0400, Matt Juszczak wrote:
> > > Correct. IPF is unstable with our SMP-based (most of the time) 5.x boxes. VERY unstable. VERY VERY unstable.
> >
> > Hm, this sounds bad. What is debug.mpsafenet set to? How heavy is the traffic? I have one SMP box with ipnat, routing some megabits (even at night it's more than 30-40Mbps) without problems; however, ipnat is used only for a very small group of hosts right now. But we plan to use ipnat more heavily, so it sounds a bit scary. ;)
>
> From personal experience I can repeat what Matt has stated. It seems to be related to what NIC you have. I have had crashes with fxp (Intel Pro 100MBit) and bge (Broadcom Gigabit) NICs under moderate network load. Removing ipf reduced but did not eliminate the crashes. debug.mpsafe also reduced but did not eliminate the crashes.

No, that's different then. Please report your bugs in the usual way (gdb traceback, etc).

Kris
Re: FreeBSD -STABLE servers repeatedly crashing.
On Jul 18, 2005, at 5:39 PM, Gary Mulder wrote:
> Another person on the freebsd-amd64 list reported similar network-related crashes until he switched to em (Intel Gigabit Ethernet) NICs.

That was probably me... but I don't have any firewall on these boxes as they are not hooked up to the internet -- just internal back-end DB servers.

Vivek Khera, Ph.D. +1-301-869-4449 x806
Re: FreeBSD -STABLE servers repeatedly crashing.
On Mon, 18 Jul 2005, Pawel Malachowski wrote:
> On Mon, Jul 18, 2005 at 04:09:58PM -0400, Matt Juszczak wrote:
> > Correct. IPF is unstable with our SMP-based (most of the time) 5.x boxes. VERY unstable. VERY VERY unstable.
>
> Hm, this sounds bad. What is debug.mpsafenet set to? How heavy is the traffic? I have one SMP box with ipnat, routing some megabits (even at night it's more than 30-40Mbps) without problems; however, ipnat is used only for a very small group of hosts right now. But we plan to use ipnat more heavily, so it sounds a bit scary. ;)

From personal experience I can repeat what Matt has stated. It seems to be related to what NIC you have. I have had crashes with fxp (Intel Pro 100MBit) and bge (Broadcom Gigabit) NICs under moderate network load. Removing ipf reduced but did not eliminate the crashes. debug.mpsafe also reduced but did not eliminate the crashes.

Another person on the freebsd-amd64 list reported similar network-related crashes until he switched to em (Intel Gigabit Ethernet) NICs.

Gary
Re: FreeBSD -STABLE servers repeatedly crashing.
> For me, 5 days up time after switching from IPF to PF. Before the switch a couple of hours of uptime was the maximum. Seems like the crashes are caused by ipfilter.

Still same for me :) Uptime almost 20 days now after switching to PF.
Re: FreeBSD -STABLE servers repeatedly crashing.
On Mon, Jul 18, 2005 at 04:09:58PM -0400, Matt Juszczak wrote:
> Correct. IPF is unstable with our SMP-based (most of the time) 5.x boxes. VERY unstable. VERY VERY unstable.

Hm, this sounds bad. What is debug.mpsafenet set to? How heavy is the traffic? I have one SMP box with ipnat, routing some megabits (even at night it's more than 30-40Mbps) without problems; however, ipnat is used only for a very small group of hosts right now. But we plan to use ipnat more heavily, so it sounds a bit scary. ;)

-- 
Paweł Małachowski
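For anyone following along, the value asked about above can be read with sysctl(8). This is only a minimal sketch, assuming a FreeBSD 5.x host where the debug.mpsafenet OID exists; the fallback string "unknown" is my own placeholder and is what gets printed on any system without that OID.

```sh
#!/bin/sh
# Read debug.mpsafenet if this kernel exposes it; otherwise fall back to
# the placeholder value "unknown" (an illustrative choice, not a sysctl value).
mpsafenet=$(sysctl -n debug.mpsafenet 2>/dev/null) || mpsafenet=unknown
echo "debug.mpsafenet=${mpsafenet}"
```

As far as I know this is a boot-time tunable on 5.x, so changing it would mean a line such as debug.mpsafenet=0 in /boot/loader.conf followed by a reboot; please check that against your release before relying on it.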
Re: FreeBSD -STABLE servers repeatedly crashing.
> I find these messages kind of weird. Are you saying your servers only run long periods of uptime with pf and *not* with ipf? I run a server and almost never take it down. IPF performs very well, including a lot of natting for my home network.

Correct. IPF is unstable with our SMP-based (most of the time) 5.x boxes. VERY unstable. VERY VERY unstable.

-Matt
Re: FreeBSD -STABLE servers repeatedly crashing.
On Mon, 18 Jul 2005 14:32:09 -0400 (EDT) Matt Juszczak [EMAIL PROTECTED] wrote:
> > For me, 5 days up time after switching from IPF to PF. Before the switch a couple of hours of uptime was the maximum. Seems like the crashes are caused by ipfilter.
>
> Still same for me :) Uptime almost 20 days now after switching to PF.

I find these messages kind of weird. Are you saying your servers only run long periods of uptime with pf and *not* with ipf? I run a server and almost never take it down. IPF performs very well, including a lot of natting for my home network.

-- 
dick -- http://nagual.st/ -- PGP/GnuPG key: F86289CE
++ Running FreeBSD 4.11-stable ++ FreeBSD 5.4
+ Nai tiruvantel ar vayuvantel i Valar tielyanna nu vilja
Re: FreeBSD -STABLE servers repeatedly crashing.
On Tue, 12 Jul 2005, Matt Juszczak wrote:
> So far a 13-day uptime after switching from IPF to PF. If that's not the problem, I hope I find it soon considering this is a production server ... but it seems to be more stable.

For me, 5 days up time after switching from IPF to PF. Before the switch a couple of hours of uptime was the maximum. Seems like the crashes are caused by ipfilter.
Re: FreeBSD -STABLE servers repeatedly crashing.
> Could you try SMP kernel without IPF support and without using IPF module? Could you confirm, that your SMP kernel is not crashing when you do not use IPF?

Interesting that the box has survived almost two days now, while it was always crashing after at least 8 hours. Anyway, I have compiled a new kernel without ipfilter, I have used pf instead (the configuration changes from ipfilter to pf were mostly minor). We'll see how long the box survives now.

Blaz Zupan, Medinet d.o.o, Trzaska 85, SI-2000 Maribor, Slovenia
E-mail: [EMAIL PROTECTED], Tel: +386 2 320 6320, Fax: +386 2 320 6325
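The kernel change described above could be sketched roughly as follows. This is an illustration, not the procedure actually used in this thread: it assumes the "device pf" and "device pflog" kernel config lines that I believe FreeBSD 5.3 and later accept, and the file names DL380 and DL380-PF are placeholders. It only writes a modified copy of the config; translating ipf.rules into pf.conf remains a separate manual step.

```sh
#!/bin/sh
# Sketch: derive a pf-based kernel config from an ipfilter-based one.
# OLD and NEW are placeholder file names, not paths taken from the thread.
OLD="${OLD:-DL380}"
NEW="${NEW:-DL380-PF}"

swap_ipf_for_pf() {
    # Drop every "options IPFILTER..." line and keep everything else.
    grep -v -E '^[[:space:]]*options[[:space:]]+IPFILTER' "$1"
    # Append the pf devices (assumed names, see the note above).
    printf 'device\t\tpf\n'
    printf 'device\t\tpflog\n'
}

# Only act when the old config is present in the current directory.
if [ -f "$OLD" ]; then
    swap_ipf_for_pf "$OLD" > "$NEW"
fi
```

The kernel then has to be rebuilt and installed from the new config in the usual way before the change has any effect.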
Re: FreeBSD -STABLE servers repeatedly crashing.
Blaz Zupan wrote on 2005-07-12 13:17:

> Interesting that the box has survived almost two days now, while it was always crashing after at least 8 hours. Anyway, I have compiled a new kernel without ipfilter, I have used pf instead (the configuration changes from ipfilter to pf were mostly minor). We'll see how long the box survives now.

Please read the thread titled "Two Options: which to choose?" (2005-06-30 by Matt Juszczak, freebsd-stable), especially Max Laier's answer to my mail in that thread.

-- 
* Maciej Wierzbicki * At paranoia's poison door *
* VOO1-RIPE VOO1-6BONE *
Re: FreeBSD -STABLE servers repeatedly crashing.
> Yes, there is absolutely no difference. Disabled HTT in the BIOS and in FreeBSD, the box still crashes.

Matt again :)

So far a 13-day uptime after switching from IPF to PF. If that's not the problem, I hope I find it soon considering this is a production server ... but it seems to be more stable.

*Knock On Wood*

-Matt
Re: FreeBSD -STABLE servers repeatedly crashing.
In order for this problem to not get lost on the freebsd-stable mailing list, I have opened a PR:

http://www.freebsd.org/cgi/query-pr.cgi?pr=83220
Re: FreeBSD -STABLE servers repeatedly crashing.
On Sun, Jul 10, 2005 at 04:58:08PM +0200, Blaz Zupan wrote:
> In order for this problem to not get lost on the freebsd-stable mailing list, I have opened a PR:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=83220

Could you try an SMP kernel without IPF support and without loading the IPF module? Could you confirm that your SMP kernel does not crash when you do not use IPF?

-- 
* Maciej Wierzbicki * At paranoia's poison door *
* VOO1-RIPE VOO1-6BONE *
Re: FreeBSD -STABLE servers repeatedly crashing.
On Fri, 1 Jul 2005, Kris Kennaway wrote:
> > On Tue, Jun 28, 2005 at 11:26:06AM -0400, Matt Juszczak wrote:
> > > After CPUID: 1, the machine locks cold and nothing else is printed to the screen.
> >
> > Try two things: 1) adding 'options KDB_STOP_NMI' to your kernel config.
>
> I just learned that you also need to set the debug.kdb.stop_cpus_with_nmi=1 sysctl (e.g. in sysctl.conf).

I'm experiencing the same crashes as Matt, but on 5.4-RELEASE-p3. The machine is a HP DL380 G3 and it is heavily loaded (postfix mail server running amavisd-new with antivirus and antispam, so it has heavy IO and CPU load). It does not survive more than a couple of hours, while it is rock stable running 4.11. We have four machines like this, three of them are now again running 4.11 and we left the fourth one at 5.4. We have two other DL380 servers working on our outbound mail queue, but they are not SMP and they are rock stable on 5.4.

Without KDB_STOP_NMI, the machine was basically stuck after a crash. Now I've finally landed in the kernel debugger and I have a trace from DDB and have also been able to generate a crashdump with call doadump.

If a developer is willing to investigate, I have:

- the vmcore file from the crash (its size is 1GB)
- the corresponding kernel, compiled with debug symbols
- a GIF of the console at the time of the crash with the backtrace at the time of crash
- a dmesg from the box (see below)
- the kernel config file

Please contact me if you want to investigate this further.

Just in case, here is a dmesg from the box:

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.4-RELEASE-p3 #0: Tue Jul 5 18:37:15 CEST 2005
    [EMAIL PROTECTED]:/usr/obj/usr/src5/sys/DL380
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (3049.93-MHz 686-class CPU)
  Origin = GenuineIntel Id = 0xf29 Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory = 1073717248 (1023 MB)
avail memory = 1045372928 (996 MB)
ACPI APIC Table: COMPAQ 0083
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 6
 cpu3 (AP): APIC ID: 7
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 Version 1.1 irqs 0-15 on motherboard
ioapic1 Version 1.1 irqs 16-31 on motherboard
ioapic2 Version 1.1 irqs 32-47 on motherboard
ioapic3 Version 1.1 irqs 48-63 on motherboard
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: COMPAQ P29 on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-safe frequency 3579545 Hz quality 1000
acpi_timer0: 32-bit timer at 3.579545MHz port 0x920-0x923 on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
cpu2: ACPI CPU on acpi0
cpu3: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge on acpi0
pci0: ACPI PCI bus on pcib0
pci0: display, VGA at device 3.0 (no driver attached)
pci0: base peripheral at device 4.0 (no driver attached)
pci0: base peripheral at device 4.2 (no driver attached)
isab0: PCI-ISA bridge at device 15.0 on pci0
isa0: ISA bus on isab0
atapci0: ServerWorks CSB5 UDMA100 controller port 0x2000-0x200f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
ohci0: OHCI (generic) USB controller mem 0xf5ef-0xf5ef0fff irq 7 at device 15.2 on pci0
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: OHCI (generic) USB controller on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pcib1: ACPI Host-PCI bridge on acpi0
pci1: ACPI PCI bus on pcib1
ciss0: Compaq Smart Array 5i port 0x3000-0x30ff mem 0xf7cf-0xf7cf3fff,0xf7dc-0xf7df irq 30 at device 3.0 on pci1
pcib2: ACPI Host-PCI bridge on acpi0
pci2: ACPI PCI bus on pcib2
bge0: Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002 mem 0xf7ef-0xf7ef irq 29 at device 1.0 on pci2
miibus0: MII bus on bge0
brgphy0: BCM5703 10/100/1000baseTX PHY on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:0e:7f:20:22:91
bge1: Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002 mem 0xf7ee-0xf7ee irq 31 at device 2.0 on pci2
miibus1: MII bus on bge1
brgphy1: BCM5703 10/100/1000baseTX PHY on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:0e:7f:20:22:90
pcib3: ACPI Host-PCI bridge on acpi0
pci3: ACPI PCI bus on pcib3
pcib4: ACPI Host-PCI bridge on acpi0
pci6: ACPI PCI bus on pcib4
pci6: base peripheral, PCI hot-plug controller at device 30.0 (no driver attached)
acpi_tz0: Thermal Zone on
Re: FreeBSD -STABLE servers repeatedly crashing.
> I'm experiencing the same crashes as Matt, but on 5.4-RELEASE-p3. The machine is a HP DL380 G3 and it is heavily loaded (postfix mail server running amavisd-new with antivirus and antispam, so it has heavy IO and CPU load). It does not survive more than a couple of hours, while it is rock stable running 4.11. We have four machines like this, three of them are now again running 4.11 and we left the fourth one at 5.4. We have two other DL380 servers working on our outbound mail queue, but they are not SMP and they are rock stable on 5.4.
>
> CPU: Intel(R) Xeon(TM) CPU 3.06GHz (3049.93-MHz 686-class CPU)
>   Origin = GenuineIntel Id = 0xf29 Stepping = 9
>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Hyperthreading: 2 logical CPUs

Have you tried disabling HTT? It doesn't give you a lot, and in some cases it decreases performance.

regards
Claus
Re: FreeBSD -STABLE servers repeatedly crashing.
> Have you tried disabling HTT? It doesn't give you a lot, and in some cases it decreases performance.

Yes, there is absolutely no difference. Disabled HTT in the BIOS and in FreeBSD, the box still crashes.
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, Jul 06, 2005 at 09:40:20AM +0200, Blaz Zupan wrote:
> If a developer is willing to investigate, I have:
>
> - the vmcore file from the crash (its size is 1GB)
> - the corresponding kernel, compiled with debug symbols

Please obtain the backtrace with kgdb.

Kris
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, 6 Jul 2005, Kris Kennaway wrote:
> Please obtain the backtrace with kgdb.

Here you go:

[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
#0 doadump () at pcpu.h:159
159 pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0 doadump () at pcpu.h:159
#1 0xc044b006 in db_fncall (dummy1=0, dummy2=0, dummy3=-1067606609, dummy4=0xe4b6c9d0 üÉśä(\205]ŔčÉśäěÉśä\222\a) at /usr/src5/sys/ddb/db_command.c:531
#2 0xc044ae14 in db_command (last_cmdp=0xc0674644, cmd_table=0x0, aux_cmd_tablep=0xc064226c, aux_cmd_tablep_end=0xc0642270) at /usr/src5/sys/ddb/db_command.c:349
#3 0xc044aedc in db_command_loop () at /usr/src5/sys/ddb/db_command.c:455
#4 0xc044ca75 in db_trap (type=12, code=0) at /usr/src5/sys/ddb/db_main.c:221
#5 0xc04e6599 in kdb_trap (type=12, code=0, tf=0xe4b6cb3c) at /usr/src5/sys/kern/subr_kdb.c:468
#6 0xc05f4c79 in trap_fatal (frame=0xe4b6cb3c, eva=36) at /usr/src5/sys/i386/i386/trap.c:812
#7 0xc05f43e9 in trap (frame=
      {tf_fs = -1040580584, tf_es = -1029439472, tf_ds = 16, tf_edi = -1038000128, tf_esi = -1066898900, tf_ebp = -457782384, tf_isp = -457782424, tf_ebx = -1040530304, tf_edx = -1040524364, tf_ecx = -1040524544, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1068574101, tf_cs = 8, tf_eflags = 65683, tf_esp = 180, tf_ss = 0}) at /usr/src5/sys/i386/i386/trap.c:255
#8 0xc05e283a in calltrap () at /usr/src5/sys/i386/i386/exception.s:140
#9 0xc1fa0018 in ?? ()
#10 0xc2a40010 in ?? ()
#11 0x0010 in ?? ()
#12 0xc2216000 in ?? ()
#13 0xc0686a2c in tcbinfo ()
#14 0xe4b6cb90 in ?? ()
#15 0xe4b6cb68 in ?? ()
#16 0xc1fac480 in ?? ()
#17 0xc1fadbb4 in ?? ()
#18 0xc1fadb00 in ?? ()
#19 0x in ?? ()
#20 0x000c in ?? ()
#21 0x in ?? ()
#22 0xc04eda6b in propagate_priority (td=0xc2216000) at /usr/src5/sys/kern/subr_turnstile.c:243
#23 0xc04ee225 in turnstile_wait (ts=0xc1fadb00, lock=0xc0686a2c, owner=0xc2216000) at /usr/src5/sys/kern/subr_turnstile.c:556
#24 0xc04c5ced in _mtx_lock_sleep (m=0xc0686a2c, td=0xc1fac480, opts=0, file=0x0, line=0) at /usr/src5/sys/kern/kern_mutex.c:552
#25 0xc0559ad8 in tcp_usr_rcvd (so=0x0, flags=0) at /usr/src5/sys/netinet/tcp_usrreq.c:602
#26 0xc0506103 in soreceive (so=0xc27bf798, psa=0x0, uio=0xe4b6cc88, mp0=0x0, controlp=0x0, flagsp=0x0) at /usr/src5/sys/kern/uipc_socket.c:1395
#27 0xc04f4bd9 in soo_read (fp=0x0, uio=0xe4b6cc88, active_cred=0xc2884a80, flags=0, td=0xc1fac480) at /usr/src5/sys/kern/sys_socket.c:91
#28 0xc04ee865 in dofileread (td=0xc1fac480, fp=0xc2e17bb0, fd=10, buf=0x0, nbyte=4096, offset=Unhandled dwarf expression opcode 0x93 ) at file.h:233
#29 0xc04ee72f in read (td=0xc1fac480, uap=0xe4b6cd14) at /usr/src5/sys/kern/sys_generic.c:107
#30 0xc05f4fe7 in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = -1078001617, tf_edi = 10, tf_esi = 300, tf_ebp = -1077942168, tf_isp = -457781900, tf_ebx = 134822152, tf_edx = 0, tf_ecx = 10, tf_eax = 3, tf_trapno = 0, tf_err = 2, tf_eip = 672556795, tf_cs = 31, tf_eflags = 658, tf_esp = -1077942212, tf_ss = 47}) at /usr/src5/sys/i386/i386/trap.c:1009
#31 0xc05e288f in Xint0x80_syscall () at /usr/src5/sys/i386/i386/exception.s:201
#32 0x002f in ?? ()
#33 0x002f in ?? ()
#34 0xbfbf002f in ?? ()
#35 0x000a in ?? ()
#36 0x012c in ?? ()
#37 0xbfbfe868 in ?? ()
#38 0xe4b6cd74 in ?? ()
#39 0x08093908 in ?? ()
#40 0x in ?? ()
#41 0x000a in ?? ()
#42 0x0003 in ?? ()
#43 0x in ?? ()
#44 0x0002 in ?? ()
#45 0x281666fb in ?? ()
#46 0x001f in ?? ()
#47 0x0292 in ?? ()
#48 0xbfbfe83c in ?? ()
#49 0x002f in ?? ()
#50 0x in ?? ()
#51 0x in ?? ()
#52 0x in ?? ()
#53 0x in ?? ()
#54 0x2c75b000 in ?? ()
#55 0xc22de000 in ?? ()
#56 0xc1fac480 in ?? ()
#57 0xe4b6ccac in ?? ()
#58 0xe4b6cc94 in ?? ()
#59 0xc1f26000 in ?? ()
#60 0xc04ded13 in sched_switch (td=0x12c, newtd=0x8093908, flags=Cannot access memory at address 0xbfbfe878 ) at /usr/src5/sys/kern/sched_4bsd.c:881
Previous frame inner to this frame (corrupt stack?)
(kgdb) quit
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, Jul 06, 2005 at 06:10:20PM +0200, Blaz Zupan wrote:
> On Wed, 6 Jul 2005, Kris Kennaway wrote:
> > Please obtain the backtrace with kgdb.
>
> Here you go:
>
> #9 0xc1fa0018 in ?? ()
> #10 0xc2a40010 in ?? ()
> #11 0x0010 in ?? ()
> #12 0xc2216000 in ?? ()
> #13 0xc0686a2c in tcbinfo ()
> #14 0xe4b6cb90 in ?? ()
> #15 0xe4b6cb68 in ?? ()
> #16 0xc1fac480 in ?? ()
> #17 0xc1fadbb4 in ?? ()
> #18 0xc1fadb00 in ?? ()
> #19 0x in ?? ()
> #20 0x000c in ?? ()
> #21 0x in ?? ()
> #22 0xc04eda6b in propagate_priority (td=0xc2216000) at /usr/src5/sys/kern/subr_turnstile.c:243
> #23 0xc04ee225 in turnstile_wait (ts=0xc1fadb00, lock=0xc0686a2c, owner=0xc2216000) at /usr/src5/sys/kern/subr_turnstile.c:556
> #24 0xc04c5ced in _mtx_lock_sleep (m=0xc0686a2c, td=0xc1fac480, opts=0, file=0x0, line=0) at /usr/src5/sys/kern/kern_mutex.c:552
> #25 0xc0559ad8 in tcp_usr_rcvd (so=0x0, flags=0) at /usr/src5/sys/netinet/tcp_usrreq.c:602

Interesting, this seems to finger the TCP code. Are you compiling your kernel with -O2, though (this causes bogus stack frames like the ones you have here)? If so, recompile with -O and try to obtain another trace.

CC'ing rwatson.

Kris

> #26 0xc0506103 in soreceive (so=0xc27bf798, psa=0x0, uio=0xe4b6cc88, mp0=0x0, controlp=0x0, flagsp=0x0) at /usr/src5/sys/kern/uipc_socket.c:1395
> #27 0xc04f4bd9 in soo_read (fp=0x0, uio=0xe4b6cc88, active_cred=0xc2884a80, flags=0, td=0xc1fac480) at /usr/src5/sys/kern/sys_socket.c:91
> #28 0xc04ee865 in dofileread (td=0xc1fac480, fp=0xc2e17bb0, fd=10, buf=0x0, nbyte=4096, offset=Unhandled dwarf expression opcode 0x93 ) at file.h:233
> #29 0xc04ee72f in read (td=0xc1fac480, uap=0xe4b6cd14) at /usr/src5/sys/kern/sys_generic.c:107
> #30 0xc05f4fe7 in syscall (frame=
>       {tf_fs = 47, tf_es = 47, tf_ds = -1078001617, tf_edi = 10, tf_esi = 300, tf_ebp = -1077942168, tf_isp = -457781900, tf_ebx = 134822152, tf_edx = 0, tf_ecx = 10, tf_eax = 3, tf_trapno = 0, tf_err = 2, tf_eip = 672556795, tf_cs = 31, tf_eflags = 658, tf_esp = -1077942212, tf_ss = 47}) at /usr/src5/sys/i386/i386/trap.c:1009
> #31 0xc05e288f in Xint0x80_syscall () at /usr/src5/sys/i386/i386/exception.s:201
> #32 0x002f in ?? ()
> #33 0x002f in ?? ()
> #34 0xbfbf002f in ?? ()
> #35 0x000a in ?? ()
> #36 0x012c in ?? ()
> #37 0xbfbfe868 in ?? ()
> #38 0xe4b6cd74 in ?? ()
> #39 0x08093908 in ?? ()
> #40 0x in ?? ()
> #41 0x000a in ?? ()
> #42 0x0003 in ?? ()
> #43 0x in ?? ()
> #44 0x0002 in ?? ()
> #45 0x281666fb in ?? ()
> #46 0x001f in ?? ()
> #47 0x0292 in ?? ()
> #48 0xbfbfe83c in ?? ()
> #49 0x002f in ?? ()
> #50 0x in ?? ()
> #51 0x in ?? ()
> #52 0x in ?? ()
> #53 0x in ?? ()
> #54 0x2c75b000 in ?? ()
> #55 0xc22de000 in ?? ()
> #56 0xc1fac480 in ?? ()
> #57 0xe4b6ccac in ?? ()
> #58 0xe4b6cc94 in ?? ()
> #59 0xc1f26000 in ?? ()
> #60 0xc04ded13 in sched_switch (td=0x12c, newtd=0x8093908, flags=Cannot access memory at address 0xbfbfe878 ) at /usr/src5/sys/kern/sched_4bsd.c:881
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) quit
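When reading a trace like the one quoted above, it can help to hide the frames that kgdb could not resolve to a symbol. A small sketch, assuming the trace has been saved one frame per line to a file (the name trace.txt is hypothetical); it is only a reading aid and does nothing about the cause of the bogus frames.

```sh
#!/bin/sh
# Print only the lines of a saved backtrace that are not unresolved
# "in ?? ()" frames. -F makes grep treat the pattern as a fixed string.
resolved_frames() {
    grep -v -F ' in ?? ()' "$1"
}

# trace.txt is a placeholder name for a saved kgdb backtrace.
if [ -f trace.txt ]; then
    resolved_frames trace.txt
fi
```

Applied to the trace in this thread, that would leave the ddb and trap frames, tcbinfo, the turnstile and mutex frames, and the socket read path.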
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, 6 Jul 2005, Kris Kennaway wrote:
> Interesting, this seems to finger the TCP code. Are you compiling your kernel with -O2 though (this causes bogus stack frames like you have here)? If so, recompile with -O and try to obtain another trace.

Nope, no funky compile options, all at the default. The only weird thing I'm doing is that the world is built on a 4.11 box and is shared between all our boxes, so that we don't need to compile multiple times.

The kernel config is here:

machine         i386
cpu             I686_CPU
ident           DL380

options         SCHED_4BSD              # 4BSD scheduler
options         INET                    # InterNETworking
options         INET6                   # IPv6 communications protocols
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         MD_ROOT                 # MD is a potential root device
options         GEOM_GPT                # GUID Partition Tables.
options         COMPAT_43               # Compatible with BSD 4.3 [KEEP THIS!]
options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         ADAPTIVE_GIANT          # Giant mutex is adaptive.
options         NMBCLUSTERS=12000
options         IPFILTER
options         IPFILTER_LOG
options         SMP
options         INCLUDE_CONFIG_FILE
options         KDB_STOP_NMI
options         KDB
options         DDB
makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

device          apic                    # I/O APIC
device          isa
device          eisa
device          pci
device          fdc
device          ata
device          atapicd                 # ATAPI CDROM drives
options         ATA_STATIC_ID           # Static device numbering
device          scbus                   # SCSI bus (required for SCSI)
device          da                      # Direct Access (disks)
device          ciss                    # Compaq Smart RAID 5*
device          atkbdc                  # AT keyboard controller
device          atkbd                   # AT keyboard
device          psm                     # PS/2 mouse
device          vga                     # VGA video card driver
device          sc
device          agp                     # support several AGP chipsets
device          npx
device          pmtimer
device          sio                     # 8250, 16[45]50 based serial ports
device          miibus                  # MII bus support
device          bge                     # Broadcom BCM570xx Gigabit Ethernet
device          loop                    # Network loopback
device          mem                     # Memory and kernel memory devices
device          io                      # I/O device
device          random                  # Entropy device
device          ether                   # Ethernet support
device          pty                     # Pseudo-ttys (telnet etc)
device          md                      # Memory disks
device          bpf                     # Berkeley packet filter
device          ohci                    # OHCI PCI-USB interface
device          usb                     # USB Bus (required)
device          ukbd                    # Keyboard
device          ums                     # Mouse
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, Jul 06, 2005 at 06:20:38PM +0200, Blaz Zupan wrote:
> On Wed, 6 Jul 2005, Kris Kennaway wrote:
> > Interesting, this seems to finger the TCP code. Are you compiling your kernel with -O2 though (this causes bogus stack frames like you have here)? If so, recompile with -O and try to obtain another trace.
>
> Nope, no funky compile options, all at the default. The only weird thing I'm doing is that the world is built on a 4.11 box and is shared between all our boxes, so that we don't need to compile multiple times. The kernel config is here:

That should be OK as long as you're not cross-compiling for different architectures.

Kris
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, 6 Jul 2005, Kris Kennaway wrote:
> That should be OK as long as you're not cross-compiling for different architectures.

No, we only have i386 boxes.
Re: FreeBSD -STABLE servers repeatedly crashing.
On Jul 6, 2005, at 6:29 PM, Blaz Zupan wrote:
> On Wed, 6 Jul 2005, Kris Kennaway wrote:
> > That should be OK as long as you're not cross-compiling for different architectures.
>
> No, we only have i386 boxes.

Hi, thanks for doing this work. I was working on preparing a similar set of information, but have been too overworked lately.

We have ordered and taken delivery of a substantial number of DL380 (Intel) and DL385 (amd64) machines that will all be running FreeBSD. However, the recent reports about trouble on these systems have made me wary. Perhaps this will give FreeBSD the solution it needs (I've seen similar issues on other SMP systems), and me the sleep I need before launch in September ;)

Thanks again. Now just hoping it's helpful to someone ;)

/Eirik
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, Jun 29, 2005 at 06:05:35AM -0400, Kris Kennaway wrote:
> On Tue, Jun 28, 2005 at 11:26:06AM -0400, Matt Juszczak wrote:
> > > OK, when it crashes next and is sat at the db prompt, type "tr" and press enter to get a trace. Copy this down (or have a serial console to capture the output). Also, try typing "call doadump()" and see if that succeeds in generating a crash dump. How were you trying to generate one before?
> > >
> > > Gavin
> >
> > I can't type anything. The machine locks up. See:
> >
> > http://paste.atopia.net/126
> >
> > After CPUID: 1, the machine locks cold and nothing else is printed to the screen.
>
> Try two things:
>
> 1) adding 'options KDB_STOP_NMI' to your kernel config.

I just learned that you also need to set the debug.kdb.stop_cpus_with_nmi=1 sysctl (e.g. in sysctl.conf).

Kris
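The two settings named above could be recorded with something like the sketch below. The option name and the sysctl name come from this thread; everything else is an assumption for illustration, in particular the helper add_line and the scratch paths, which default to a temporary directory so that running the script as-is does not touch a live system.

```sh
#!/bin/sh
# Sketch: record the kernel option and the sysctl named in the thread.
# KERNCONF and SYSCTL_CONF default to scratch files (placeholders); point
# them at the real kernel config and /etc/sysctl.conf only deliberately.
SCRATCH=$(mktemp -d)
KERNCONF="${KERNCONF:-$SCRATCH/MYKERNEL}"
SYSCTL_CONF="${SYSCTL_CONF:-$SCRATCH/sysctl.conf}"

# Append LINE ($2) to FILE ($1) unless an identical line is already there.
add_line() {
    grep -q -x -F -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

add_line "$KERNCONF" 'options KDB_STOP_NMI'
add_line "$SYSCTL_CONF" 'debug.kdb.stop_cpus_with_nmi=1'
echo "wrote $KERNCONF and $SYSCTL_CONF"
```

The kernel option only takes effect after the kernel is rebuilt and booted, and a line in sysctl.conf is applied at the next boot.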
RE: FreeBSD -STABLE servers repeatedly crashing
On Tue, 28 Jun 2005, Matt Juszczak [EMAIL PROTECTED] wrote:
> > Please try out this patch to aid the above problem with hang instead of dump:
> >
> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c.diff?r1=1.275&r2=1.276
>
> This box is now crashing once every 12 hours. I can't apply this patch :-(. Does anyone have any suggestions on how I can work around this? Some have said it's an SMP problem, some have said it's a 4 GB RAM problem, and some have said it's an IPF problem. If I disabled all three of those things, would that help this box be stable until the code could be fixed?

Disabling SMP helped in my case.

-- 
How fortunate the man with none. --Dead Can Dance
Re: FreeBSD -STABLE servers repeatedly crashing.
On Tue, Jun 28, 2005 at 11:26:06AM -0400, Matt Juszczak wrote:
> > OK, when it crashes next and is sat at the db prompt, type tr and press
> > enter to get a trace. Copy this down (or have a serial console to
> > capture the output). Also, try typing call doadump() and see if that
> > succeeds in generating a crash dump. How were you trying to generate one
> > before?
> >
> > Gavin
>
> I can't type anything. The machine locks up. See:
>
> http://paste.atopia.net/126
>
> After CPUID: 1, the machine locks cold and nothing else is printed to
> the screen.

Try two things:

1) adding 'options KDB_STOP_NMI' to your kernel config.

2) If you still can't get it to break to DDB, then compile up a debugging kernel, run kgdb on it (as described in the developers' handbook), and list *(0xblah) where that address is the value of the instruction pointer in the trap message (e.g. 0xc6644eff in your paste above).

That might at least be a start.

Kris
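The second suggestion above amounts to feeding the instruction pointer from the trap message to kgdb's `list` command. A small sketch, assuming a POSIX shell; `kgdb_list_cmd` is a hypothetical helper that only validates the address and prints the command text, it does not run kgdb.

```sh
#!/bin/sh
# kgdb_list_cmd ADDR: print the kgdb command for a trap-message address.
# Hypothetical helper for illustration; ADDR must look like 0xc6644eff.
kgdb_list_cmd() {
    addr=$1
    case $addr in
        0x*) hex=${addr#0x} ;;
        *) echo "expected an address like 0xc6644eff" >&2; return 1 ;;
    esac
    # Reject an empty value or anything that is not a hex digit.
    case $hex in
        ''|*[!0-9a-fA-F]*)
            echo "not a hexadecimal address: $addr" >&2
            return 1 ;;
    esac
    printf 'list *(%s)\n' "$addr"
}
```

For the address in the paste, `kgdb_list_cmd 0xc6644eff` prints `list *(0xc6644eff)`, which would then be typed at the kgdb prompt after starting kgdb on the debugging kernel as the developers' handbook describes.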
Re: FreeBSD -STABLE servers repeatedly crashing
On Tue, Jun 28, 2005 at 01:50:48PM -0400, Matt Juszczak wrote:
M> > Please try out this patch to aid the above problem with hang instead of
M> > dump:
M> >
M> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c.diff?r1=1.275&r2=1.276
M>
M> This patch wouldn't go through
M> I tried patching against:
M>
M> __FBSDID("$FreeBSD: src/sys/i386/i386/trap.c,v 1.267.2.3 2005/05/01
M> 05:34:46 dwhite Exp $");
M>
M> which is -STABLE

Here is the attached patch. It should work for STABLE. It should fix the problem with frozen kdb, and give you the ability to obtain a crashdump.

--
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

Index: trap.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
retrieving revision 1.267.2.3
diff -u -r1.267.2.3 trap.c
--- trap.c	1 May 2005 05:34:46 -	1.267.2.3
+++ trap.c	29 Jun 2005 14:27:04 -
@@ -809,8 +809,15 @@
 	}
 
 #ifdef KDB
-	if (kdb_trap(type, 0, frame))
-		return;
+	{
+		register_t eflags;
+		eflags = intr_disable();
+		if (kdb_trap(type, 0, frame)) {
+			intr_restore(eflags);
+			return;
+		}
+		intr_restore(eflags);
+	}
 #endif
 	printf("trap number = %d\n", type);
 	if (type <= MAX_TRAP_MSG)
Re: FreeBSD -STABLE servers repeatedly crashing.
On Wed, 29 Jun 2005, Kris Kennaway wrote: On Tue, Jun 28, 2005 at 11:26:06AM -0400, Matt Juszczak wrote: OK, when it crashes next and is sat at the db prompt, type tr and press enter to get a trace. Copy this down (or have a serial console to capture the output). Also, try typing call doadump() and see if that succeeds in generating a crash dump. How were you trying to generate one before? Gavin I can't type anything. The machine locks up. See: http://paste.atopia.net/126 After CPUID: 1, the machine locks cold and nothing else is printed to the screen. Try two things: 1) adding 'options KDB_STOP_NMI' to your kernel config. 2) If you still can't get it to break to DDB, then compile up a debugging kernel, run kgdb on it (as described in the developers' handbook), and list *(0xblah) where that address is the value of the instruction pointer in the trap message (e.g. 0xc6644eff in your paste above). That might at least be a start. Kris OK :) I'll try this next time it crashes. I actually disabled ipf a few nights ago and it hasn't crashed since... knock on wood. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
On Mon, Jun 27, 2005 at 07:58:18PM -0400, Matt Juszczak wrote:
M> > Can you please build kernel with debugging and obtain a crashdump?
M>
M> High activity on the box today caused us to be able to crash it again
M> within 9 hours. I configured all steps per the developers handbook, but
M> when I went to do savecore, it said no dumps.
M>
M> It appears the machine is completely locked up when it does a kernel trap.
M> The keyboard is non-responsive, and the machine hangs and doesn't reboot.

Please try out this patch to aid the above problem with hang instead of dump:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c.diff?r1=1.275&r2=1.276

--
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
Re: FreeBSD -STABLE servers repeatedly crashing.
Gleb Smirnoff wrote: On Mon, Jun 27, 2005 at 01:01:09AM -0400, Matt Juszczak wrote: M About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE. M I also turned on procmail globally on our mail server. Here is our M current FreeBSD server setup: M M URANUS - primary ldap M CALIBAN - secondary ldap M ORION - primary mail M M Orion was the first one to crash, about three weeks ago. Orion is M constantly talking to uranus, because uranus is our primary ldap server M (we have a planet scheme), and caliban is our secondary ldap server. I M ran an email flood test on orion to see if I could crash it again. This M time, the high requests on Uranus caused Uranus to crash. With two M different servers on two different hardware setups crashing, I had to M start thinking of what could be causing the problem. M M Memory tests on both servers came back OK. Orion had some ECC errors M which it was able to fix. I wasn't able to catch orion's first crash, M but I was able to catch uranus's first crash: M M http://paste.atopia.net/126 Can you please build kernel with debugging and obtain a crashdump? Ever since I setup the debug kernel the machine is now crashing every 12 hours. I think I have to switch to OpenBSD or 4.11 FreeBSD because this box can't keep crashing. It refuses to do a crash dump. -Matt ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
On Tue, 2005-06-28 at 10:49 -0400, Matt Juszczak wrote: Gleb Smirnoff wrote: On Mon, Jun 27, 2005 at 01:01:09AM -0400, Matt Juszczak wrote: M About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE. M I also turned on procmail globally on our mail server. Here is our M current FreeBSD server setup: M M URANUS - primary ldap M CALIBAN - secondary ldap M ORION - primary mail M M Orion was the first one to crash, about three weeks ago. Orion is M constantly talking to uranus, because uranus is our primary ldap server M (we have a planet scheme), and caliban is our secondary ldap server. I M ran an email flood test on orion to see if I could crash it again. This M time, the high requests on Uranus caused Uranus to crash. With two M different servers on two different hardware setups crashing, I had to M start thinking of what could be causing the problem. M M http://paste.atopia.net/126 Can you please build kernel with debugging and obtain a crashdump? Ever since I setup the debug kernel the machine is now crashing every 12 hours. I think I have to switch to OpenBSD or 4.11 FreeBSD because this box can't keep crashing. It refuses to do a crash dump. OK, when it crashes next and is sat at the db prompt, type tr and press enter to get a trace. Copy this down (or have a serial console to capture the output). Also, try typing call doadump() and see if that succeeds in generating a crash dump. How were you trying to generate one before? Gavin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
Gavin Atkinson wrote: On Tue, 2005-06-28 at 10:49 -0400, Matt Juszczak wrote: Gleb Smirnoff wrote: On Mon, Jun 27, 2005 at 01:01:09AM -0400, Matt Juszczak wrote: M About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE. M I also turned on procmail globally on our mail server. Here is our M current FreeBSD server setup: M M URANUS - primary ldap M CALIBAN - secondary ldap M ORION - primary mail M M Orion was the first one to crash, about three weeks ago. Orion is M constantly talking to uranus, because uranus is our primary ldap server M (we have a planet scheme), and caliban is our secondary ldap server. I M ran an email flood test on orion to see if I could crash it again. This M time, the high requests on Uranus caused Uranus to crash. With two M different servers on two different hardware setups crashing, I had to M start thinking of what could be causing the problem. M M http://paste.atopia.net/126 Can you please build kernel with debugging and obtain a crashdump? Ever since I setup the debug kernel the machine is now crashing every 12 hours. I think I have to switch to OpenBSD or 4.11 FreeBSD because this box can't keep crashing. It refuses to do a crash dump. OK, when it crashes next and is sat at the db prompt, type tr and press enter to get a trace. Copy this down (or have a serial console to capture the output). Also, try typing call doadump() and see if that succeeds in generating a crash dump. How were you trying to generate one before? Gavin I can't type anything. The machine locks up. See: http://paste.atopia.net/126 After CPUID: 1, the machine locks cold and nothing else is printed to the screen. -Matt ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
Matt Juszczak wrote:
> Ever since I set up the debug kernel the machine is now crashing every 12 hours. I think I have to switch to OpenBSD or 4.11 FreeBSD because this box can't keep crashing. It refuses to do a crash dump.
>
> -Matt

Matt,

Does it refuse to crash dump or is it that you can't get the core file back? Make sure you have enough disk space in /var/crash for capturing the dump. You need at least as much free disk as you have memory configured. There was a post saying that fsck may be trashing core files if it starts using swap. To maximize the chances of recovering the core file, boot into single user after the crash and do the following:

    fsck -y      # or fsck and read every question, if you're paranoid
    mount -f /   # remounts root read/write
    mount /var
    savecore /var/crash
    exit

Gary
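The disk-space rule above ("at least as much free disk as you have memory configured") can be checked with a little arithmetic. A sketch assuming a POSIX shell with 64-bit arithmetic; `dump_fits` is a hypothetical helper that takes both numbers as arguments, so it does not depend on any particular system.

```sh
#!/bin/sh
# dump_fits FREE_KB PHYSMEM_BYTES
# Succeed if FREE_KB kilobytes of free space can hold a dump of
# PHYSMEM_BYTES bytes of memory. Hypothetical helper for illustration.
dump_fits() {
    free_kb=$1
    physmem_bytes=$2
    # Round the memory size up to whole kilobytes.
    need_kb=$(( (physmem_bytes + 1023) / 1024 ))
    [ "$free_kb" -ge "$need_kb" ]
}

# Possible use on a FreeBSD host (not run here):
#   free_kb=$(df -k /var/crash | awk 'NR==2 {print $4}')
#   dump_fits "$free_kb" "$(sysctl -n hw.physmem)" || echo "/var/crash too small"
```

For a 4 GB machine the threshold works out to 4194304 KB free in /var/crash; anything less and savecore may not be able to store the full dump.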
Re: FreeBSD -STABLE servers repeatedly crashing.
> fsck -y      # or fsck and read every question, if you're paranoid
> mount -f /   # remounts root read/write
> mount /var
> savecore /var/crash
> exit
>
> Gary

Gary:

After it crashes, it locks up and hangs, no keyboard response, etc. When I reboot, I go into single user mode and do:

    fsck -p
    mount -a -t ufs
    savecore /var/crash /dev/da0s1b

(/dev/da0s1b is my swap.) It says no dump available. These instructions are from the handbook.

I just got sent a patch a little while ago which apparently will help the system not lock up. I'm going to try it later today and see where it gets me.

Thanks,
Matt
RE: FreeBSD -STABLE servers repeatedly crashing
> Please try out this patch to aid the above problem with hang instead of dump:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c.diff?r1=1.275&r2=1.276

This box is now crashing once every 12 hours. I can't apply this patch :-(. Does anyone have any suggestions on how I can work around this? Some have said it's an SMP problem, some have said it's a 4 GB RAM problem, and some have said it's an IPF problem. If I disabled all three of those things, would that help this box be stable until the code could be fixed?

Thanks,
Matt
Re: FreeBSD -STABLE servers repeatedly crashing
> Matt,
>
> Sadly the FreeBSD guys will need more info before a fix is possible. I would suggest you revert back to FreeBSD 5.3, if you can. Even if you get a patch you'd want to do a whole lot of regression testing before putting it in production as it might break something else.

Gary,

Do you know what the chances are that this problem I'm experiencing is SMP related? I don't mind turning off SMP, and I guess I could for now to see if that runs stable. Otherwise, I think we're going to switch to OpenBSD, because these crashes are occurring so frequently (twice a day)... and as far as the patch and regression testing, if someone sent me a patch right now I would put it on the server, because the server already crashes daily, so a faulty patch wouldn't change much :-(.

I appreciate your response. I'm going to do a little more research today before I make my decision on a platform switch.

-Matt
Re: FreeBSD -STABLE servers repeatedly crashing
> Gary,
>
> Do you know what the chances are that this problem I'm experiencing is SMP related? I don't mind turning off SMP, and I guess I could for now to see if that runs stable. Otherwise, I think we're going to switch to OpenBSD, because these crashes are occurring so frequently (twice a day)... and as far as the patch and regression testing, if someone sent me a patch right now I would put it on the server, because the server already crashes daily, so a faulty patch wouldn't change much :-(.
>
> I appreciate your response. I'm going to do a little more research today before I make my decision on a platform switch.

Only way to find out is to try. You could build and install the non-SMP kernel and reboot when you can, or let it boot the new kernel next time the system(s) crash. A lot of the issues seem to be SMP-related. I really loaded up a GENERIC 5.4 kernel and wasn't able to get it to panic. What do you have to lose at this point?

I would suggest that before committing to OpenBSD you verify that all the hardware/software you have/use is supported under OpenBSD:

http://www.daemonnews.org/200104/bsd_family.html
http://www.monkey.org/openbsd/archive/misc/0311/msg01803.html

As an example: I'm fairly sure OpenBSD has recently dropped (or will drop) support for the Adaptec aac driver as Theo is not happy with Adaptec's response to his queries for interface specs.

From what I've heard (YMMV), OpenBSD SMP support is not very optimal, possibly because it was implemented extremely conservatively. OpenBSD MySQL with two CPUs can be slower than with one:

http://software.newsforge.com/article.pl?sid=04/12/27/1243207&from=rss

Gary

ps. it is a case of: cost, speed, reliability - choose any two.
Re: FreeBSD -STABLE servers repeatedly crashing
> Only way to find out is to try. You could build and install the non-SMP kernel and reboot when you can, or let it boot the new kernel next time the system(s) crash. A lot of the issues seem to be SMP-related. I really loaded up a GENERIC 5.4 kernel and wasn't able to get it to panic. What do you have to lose at this point?
>
> I would suggest that before committing to OpenBSD you verify that all the hardware/software you have/use is supported under OpenBSD:
>
> http://www.daemonnews.org/200104/bsd_family.html
> http://www.monkey.org/openbsd/archive/misc/0311/msg01803.html
>
> As an example: I'm fairly sure OpenBSD has recently dropped (or will drop) support for the Adaptec aac driver as Theo is not happy with Adaptec's response to his queries for interface specs.
>
> From what I've heard (YMMV), OpenBSD SMP support is not very optimal, possibly because it was implemented extremely conservatively. OpenBSD MySQL with two CPUs can be slower than with one:
>
> http://software.newsforge.com/article.pl?sid=04/12/27/1243207&from=rss
>
> Gary
>
> ps. it is a case of: cost, speed, reliability - choose any two.

Agreed. Theo just yelled at me because I was having this discussion on the OpenBSD misc mailing list, which is my fault :-/ ... a lot of people were responding though and I think it just got out of hand.

As much as OpenBSD seems nice, my FreeBSD experience is a lot better. I'm going to switch to Uniprocessor and see if that makes us more stable. Hopefully it will.

-Matt
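One way to prepare the uniprocessor kernel discussed above is to derive it from the existing SMP config. A sketch under stated assumptions: it uses a POSIX shell and awk, and it assumes the custom kernel config lists `options SMP` directly rather than pulling it in through an `include` line. `strip_smp_option` and the config names below are hypothetical.

```sh
#!/bin/sh
# strip_smp_option: copy a kernel config from stdin to stdout,
# dropping any active "options SMP" line. Commented-out lines and
# every other option are passed through unchanged.
# Hypothetical helper for illustration.
strip_smp_option() {
    awk '!($1 == "options" && $2 == "SMP")'
}

# Possible use (file names are placeholders, not run here):
#   cd /usr/src/sys/i386/conf
#   strip_smp_option < MYKERNEL-SMP > MYKERNEL-UP
#   cd /usr/src && make buildkernel KERNCONF=MYKERNEL-UP
```

The resulting config would then be built and installed with the usual buildkernel/installkernel steps, and the machine rebooted at a convenient time or left to pick up the new kernel after its next crash, as suggested in the quoted message.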
Re: FreeBSD -STABLE servers repeatedly crashing
Hi,

I have something like 20 boxes (Dell Power Edge 370, Fujitsu-Siemens PRIMERGY 200 and a couple of dual AMD64 Fujitsu-Siemens) running 5.4-STABLE. So far, the only machine on which I have experienced freezing, and where I was unable to get dropped into KDB or to get any sort of vmcore, was a Dell Power Edge 1600SC (dual Xeon 2.4GHz with 4 GB). It had been running squid-2.5 linked to pthread; I noticed that when I switched to oops, which was compiled on 5.2.1 and linked to libc_r, the machine stopped crashing (HTT disabled, IPFILTER also disabled, GENERIC configuration).

However, I decided to experiment and upgraded to 6.0-CURRENT, and so far I haven't experienced any problems - except one panic caused by linux.ko while running edonkeyclc for linux (it was just an experiment to see if it would work on 6.0-CURRENT).

I suppose that there might be some problems related to SMP on 5.4. I don't know what you are using the problematic servers for, and I don't know if it is smart to use 6.0-CURRENT, but so far I have had a positive experience with it on the problematic server and would rather stay with FBSD than switch to NetBSD or OpenBSD.

Regards,
gg.
Re: FreeBSD -STABLE servers repeatedly crashing
> Hi,
>
> I have something like 20 boxes (Dell Power Edge 370, Fujitsu-Siemens PRIMERGY 200 and a couple of dual AMD64 Fujitsu-Siemens) running 5.4-STABLE. So far, the only machine on which I have experienced freezing, and where I was unable to get dropped into KDB or to get any sort of vmcore, was a Dell Power Edge 1600SC (dual Xeon 2.4GHz with 4 GB). It had been running squid-2.5 linked to pthread; I noticed that when I switched to oops, which was compiled on 5.2.1 and linked to libc_r, the machine stopped crashing (HTT disabled, IPFILTER also disabled, GENERIC configuration).
>
> However, I decided to experiment and upgraded to 6.0-CURRENT, and so far I haven't experienced any problems - except one panic caused by linux.ko while running edonkeyclc for linux (it was just an experiment to see if it would work on 6.0-CURRENT).
>
> I suppose that there might be some problems related to SMP on 5.4. I don't know what you are using the problematic servers for, and I don't know if it is smart to use 6.0-CURRENT, but so far I have had a positive experience with it on the problematic server and would rather stay with FBSD than switch to NetBSD or OpenBSD.

With what you're saying, maybe my problem is that I use IPFILTER and maybe it isn't an SMP problem? Should I switch to PF?

-Matt
Re: FreeBSD -STABLE servers repeatedly crashing
Some people suggested so - pf is supposed to be faster than IPFILTER. However, if you are experiencing machine freezing like I did on 5.4-STABLE, I'm not sure this will help - if nothing else helps, try 6.0-CURRENT. I've also noticed that it is running much faster with all debugging enabled than regular 5.4-STABLE on the same hardware...

Regards,
gg.

On Tue, 28 Jun 2005, Matt Juszczak wrote:
> With what you're saying, maybe my problem is that I use IPFILTER and maybe it isn't an SMP problem? Should I switch to PF?
>
> -Matt
Re: FreeBSD -STABLE servers repeatedly crashing
> Some people suggested so - pf is supposed to be faster than IPFILTER. However, if you are experiencing machine freezing like I did on 5.4-STABLE, I'm not sure this will help - if nothing else helps, try 6.0-CURRENT. I've also noticed that it is running much faster with all debugging enabled than regular 5.4-STABLE on the same hardware...

I don't think it's a good idea to run 6.0-CURRENT in production. I'm moving the main mail server to PF, keeping SMP on. It's also running 5.4-STABLE as of today. We'll see if any of this fixes anything.

Regards,
Matt
Re: FreeBSD -STABLE servers repeatedly crashing
FreeBSD 5.4-STABLE #11: Fri Apr 8 09:48:24 CDT 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KSD-SMP

5:02PM up 80 days, 21:08, 1 user, load averages: 4.04, 3.33, 3.01

Yes, SMP is enabled, as is implied by the kernel config tag. (Very busy compilation, web and database server)

--
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net        My home on the net - links to everything I do!
http://scubaforum.org           Your UNCENSORED place to talk about DIVING!
http://homecuda.com             Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com    Musings Of A Sentient Mind

On Tue, Jun 28, 2005 at 05:55:17PM -0400, Matt Juszczak wrote:
> > Some people suggested so - pf is supposed to be faster than IPFILTER. However, if you are experiencing machine freezing like I did on 5.4-STABLE, I'm not sure this will help - if nothing else helps, try 6.0-CURRENT. I've also noticed that it is running much faster with all debugging enabled than regular 5.4-STABLE on the same hardware...
>
> I don't think it's a good idea to run 6.0-CURRENT in production. I'm moving the main mail server to PF, keeping SMP on. It's also running 5.4-STABLE as of today. We'll see if any of this fixes anything.
>
> Regards,
> Matt
Re: FreeBSD -STABLE servers repeatedly crashing
Yes, SMP is enabled, as is implied by the kernel config tag. (Very busy compilation, web and database server) Are you using PF? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
On Mon, Jun 27, 2005 at 01:01:09AM -0400, Matt Juszczak wrote: M About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE. M I also turned on procmail globally on our mail server. Here is our M current FreeBSD server setup: M M URANUS - primary ldap M CALIBAN - secondary ldap M ORION - primary mail M M Orion was the first one to crash, about three weeks ago. Orion is M constantly talking to uranus, because uranus is our primary ldap server M (we have a planet scheme), and caliban is our secondary ldap server. I M ran an email flood test on orion to see if I could crash it again. This M time, the high requests on Uranus caused Uranus to crash. With two M different servers on two different hardware setups crashing, I had to M start thinking of what could be causing the problem. M M Memory tests on both servers came back OK. Orion had some ECC errors M which it was able to fix. I wasn't able to catch orion's first crash, M but I was able to catch uranus's first crash: M M http://paste.atopia.net/126 Can you please build kernel with debugging and obtain a crashdump? -- Totus tuus, Glebius. GLEBIUS-RIPN GLEB-RIPE ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD -STABLE servers repeatedly crashing.
Can you please build kernel with debugging and obtain a crashdump? High activity on the box today caused us to be able to crash it again within 9 hours. I configured all steps per the developers handbook, but when I went to do savecore, it said no dumps. It appears the machine is completely locked up when it does a kernel trap. The keyboard is non-responsive, and the machine hangs and doesn't reboot. Any other suggestions would be greatly appreciated. For now I am going to take the box out of SMP mode which will hopefully keep it stable until I can find some further instructions. Regards, Matt ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]