A system failure of this sort (one that leaves no log entries of any
kind) is generally a hardware fault. Memory failures, by contrast, tend
to cause kernel panics and are easy to reproduce.

I would suggest examining the hardware components. The motherboard
could have some faulty capacitors (burst, leaking, or swollen); the
fans on the processors could be failing and causing a lockup; or the
power supply fans could be failing, causing an undervolt and lockup,
although that usually makes the system reset.

You get the idea: in my opinion, your symptoms point to hardware issues.

David

Or, as I've seen a few times, a power supply that cannot handle the load. You have 2 CPUs and a few hard disks, all of which draw a lot of power. What is the rating of your power supply? I've found FreeBSD to be finicky about hardware. If the hardware is all good, it works perfectly and never lets you down. When something starts to fail, FreeBSD hangs. Other OSes tend to chug along unpredictably instead.

If it's not the power supply, it's possibly the RAID card. I'm assuming you used the same RAID card when you moved the drives to the other server.

Just my 2c

-Clay


On 10/31/07, Дмитрий Комалеев <[EMAIL PROTECTED]> wrote:
Hello everybody

I have a big problem

There is one FreeBSD server in our company. The server platform is a Supermicro SuperServer 6014V-T2B (2x Intel Xeon 2.8, 1 GB RAM, 3WARE 3W-8006-2LP RAID controller).
The server works as:
- a gateway between LAN and Internet
- an Intranet web- and database server (Apache + MySQL + PHP)
- a firewall (OpenBSD pf)
- a transparent proxy server (Squid)
Monthly traffic through this server is about 100 GB. There are about 200 Internet users in our company.
Here is part of my dmesg listing:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE-p8 #2: Thu Oct 11 19:51:25 MSD 2007
    [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KERNEL01_NOSMP
module_register: module pci/em already exists!
Module pci/em failed to register: 17
ACPI APIC Table: <A M I  OEMAPIC >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.12-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
  AMD Features=0x20000000<LM>
  Logical CPUs per core: 2
real memory  = 1073479680 (1023 MB)
avail memory = 1041465344 (993 MB)
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ichwd module loaded
kbd1 at kbdmux0
ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <A M I OEMRSDT> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_throttle0: <ACPI CPU Throttling> on cpu0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci3: <ACPI PCI bus> on pcib3
twe0: <3ware Storage Controller. Driver version 1.50.01.002> port 0xbc00-0xbc0f mem 0xfc9ffc00-0xfc9ffc0f,0xfc000000-0xfc7fffff irq 24 at device 1.0 on pci3
twe0: [GIANT-LOCKED]
twe0: 2 ports, Firmware FE8S 1.05.00.068, BIOS BE7X 1.08.00.048
em0: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> port 0xb800-0xb83f mem 0xfc9c0000-0xfc9dffff irq 26 at device 3.0 on pci3
em0: Ethernet address: 00:30:48:58:4d:2a
em0: [FAST]
em1: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> port 0xb400-0xb43f mem 0xfc9a0000-0xfc9bffff irq 27 at device 4.0 on pci3
em1: Ethernet address: 00:30:48:58:4d:2b
em1: [FAST]
uhci0: <UHCI (generic) USB controller> port 0xe800-0xe81f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <UHCI (generic) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <UHCI (generic) USB controller> port 0xec00-0xec1f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <UHCI (generic) USB controller> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: <base peripheral> at device 29.4 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 29.5 (no driver attached)
ehci0: <Intel 6300ESB USB 2.0 controller> mem 0xfebffc00-0xfebfffff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: <Intel 6300ESB USB 2.0 controller> on ehci0
usb2: USB revision 2.0
uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pci4: <display, VGA> at device 5.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel 6300ESB UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
ichwd0: <Intel 6300ESB watchdog timer> on isa0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9800-0xca7ff,0xca800-0xcb7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2800118202 Hz quality 800
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-N/1.AA> at ata0-master UDMA33
twed0: <Unit 0, TwinStor, Normal> on twe0
twed0: 152626MB (312579760 sectors)
Trying to mount root from ufs:/dev/twed0s1a
ext0: link state changed to UP
int0: link state changed to UP
vlan0: link state changed to UP

This server hangs every day without any messages in the log files or on the system console. The keyboard doesn't respond either. All I can do is a hard reset, and no core dump files appear after the restart.
Here is my kernel configuration file:

include GENERIC
ident           KERNEL01_NOSMP
device          ichwd # Intel ICH watchdog timer
#options        SMP
options         ALTQ
options         ALTQ_CBQ
options         ALTQ_RED
options         ALTQ_RIO
options         ALTQ_HFSC
options         ALTQ_PRIQ
#options                ALTQ_NOPCC
options         SC_DISABLE_REBOOT
options         MP_WATCHDOG
options         SW_WATCHDOG

If I build and install a kernel with the SMP option enabled, the system starts hanging every two hours under its working load.

Two days of running Memtest found no errors.
I tried installing the newest Intel Ethernet adapter driver, but it made no difference. As an experiment, I also moved the system HDD to another server platform (SuperServer 6015V-TB), but the hangs did not stop.
I think this is not only a hardware problem.
Linux (Gentoo) and Windows Server 2003 worked fine on this hardware.

Please help me find a solution to this problem.

Yours faithfully
Dmitry Komaleev
IT Manager
"EDIPRESSE-KONLIGA" http://www.konliga.ru
Russia, Moscow
tel.:  +7 (495) 775-14-35, ext. 169
fax:   +7 (495) 775-14-34

P.S. I have filed a bug report about my problem, but the only advice I received was to turn off ACPI. If I disable ACPI, the RAID controller and both Ethernet controllers on my server receive the same IRQ. I believe this is not good.
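
One way to check the IRQ concern above is to scan the dmesg output for devices that report the same interrupt line. The following is only a minimal sketch in Python, not a FreeBSD tool: the function names irq_map and shared_irqs are made up for illustration, and the pattern assumes attach lines of the form "name: <description> ... irq N ..." as they appear in the listing above. The sample lines are copied from that listing, where twe0, em0 and em1 sit on IRQs 24, 26 and 27 with ACPI enabled.

```python
import re
from collections import defaultdict

# Matches attach lines such as
#   "em0: <Intel(R) PRO/1000 ...> port ... irq 26 at device 3.0 on pci3"
# and captures the device name and the IRQ number. Lines like
# "sio0: configured irq 4 not in bitmap ..." have no <description>
# and are deliberately skipped.
ATTACH_RE = re.compile(r"^(\w+): <.*?> .*?\birq (\d+)\b")


def irq_map(dmesg_text):
    """Return {irq: [device, ...]} for every attach line that reports an IRQ."""
    by_irq = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = ATTACH_RE.match(line.strip())
        if match:
            by_irq[int(match.group(2))].append(match.group(1))
    return dict(by_irq)


def shared_irqs(dmesg_text):
    """Return only the IRQs that more than one device reports."""
    return {irq: devs for irq, devs in irq_map(dmesg_text).items() if len(devs) > 1}


# Sample lines taken from the dmesg listing in this message (ACPI enabled).
SAMPLE = """\
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
twe0: <3ware Storage Controller. Driver version 1.50.01.002> port 0xbc00-0xbc0f mem 0xfc9ffc00-0xfc9ffc0f,0xfc000000-0xfc7fffff irq 24 at device 1.0 on pci3
em0: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> port 0xb800-0xb83f mem 0xfc9c0000-0xfc9dffff irq 26 at device 3.0 on pci3
em1: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> port 0xb400-0xb43f mem 0xfc9a0000-0xfc9bffff irq 27 at device 4.0 on pci3
uhci0: <UHCI (generic) USB controller> port 0xe800-0xe81f irq 16 at device 29.0 on pci0
sio0: configured irq 4 not in bitmap of probed irqs 0
"""

print(shared_irqs(SAMPLE))
```

Running the same functions over a dmesg captured after booting with ACPI disabled would show whether twe0, em0 and em1 really end up on one IRQ. Sharing an interrupt is not by itself proof of a fault, so the result is best read as one more data point rather than a diagnosis.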
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"



