Hi,

A couple of weeks ago I installed FreeBSD 8.2RC1 on a new machine (8.1 was having issues with the RAID card, and since 8.2 is nearly final I figured... why not). The machine has been running smoothly for a while, even while load-testing the hard drives and network for more than 24 hours.

Since everything was running smoothly I decided to move one of the production PostgreSQL databases to this machine. However... after a couple of hours I got the following error from the iDRAC:

PCIE Fatal Err: Critical Event sensor, bus fatal error (Slot 3) was asserted

Followed by a lot of garbled text in the console (see the full log in the attachment) and immediately this message:

Jan 20 21:09:25 sh4 kernel: NMI ISA 30, EISA ff
Jan 20 21:09:25 sh4 kernel: NMI ... going to debugger
Jan 20 21:09:25 sh4 kernel: NMI ISA 30, EISA ff
Jan 20 21:09:25 sh4 kernel: NMI ... going to debugger
Jan 20 21:09:25 sh4 kernel: NMI ISA N2M0I, I ESIASNA Mff2

Followed by this:

Jan 20 21:09:38 sh4 kernel: igb0: Watchdog timeout -- resetting
Jan 20 21:09:38 sh4 kernel: igb0: Queue(0) tdh = 944, hw tdt = 945
Jan 20 21:09:38 sh4 kernel: igb0: TX(0) desc avail = 1023,Next TX to Clean = 944
Jan 20 21:09:38 sh4 kernel: igb0: link state changed to DOWN
Jan 20 21:09:41 sh4 kernel: igb0: link state changed to UP

After which the lagg0 interface (which is using igb0 and igb1 as an LACP trunk) marks the igb0 interface as down. After a while the second interface got the same issue, which caused the lagg0 interface to become non-functional and the server unreachable.

This error looks quite a bit like the one talked about in this thread:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=81462+0+/usr/local/www/db/text/2010/freebsd-net/20100801.freebsd-net

But the solution given there (disabling polling) won't help because I don't even have device polling enabled in the kernel.

For the record, the machine regularly shows small amounts of garbled text even outside of these network interface crashes, as can be seen in the "garbled.log" file. The real crash starts at 21:09:24 according to the iDRAC log.

My kernel config is mainly stock, with some modules disabled. DEVICE_POLLING is not enabled. The garbled text should be caused by the print buffer since I do have PRINTF_BUFR_SIZE=128 in the config.
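
In case it helps to line up the igb0 and igb1 resets against the iDRAC timestamps, the watchdog and link-state lines can be pulled out of the syslog output with something like the following. This is only a minimal sketch: it assumes the log lines look exactly like the ones quoted above, and the name igb_events is made up for illustration.

```python
import re

# Matches syslog lines such as:
#   Jan 20 21:09:38 sh4 kernel: igb0: Watchdog timeout -- resetting
#   Jan 20 21:09:38 sh4 kernel: igb0: link state changed to DOWN
# Assumption: the format is the same as in the excerpts quoted above.
EVENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>igb\d+):\s+"
    r"(?P<event>Watchdog timeout|link state changed to (?:UP|DOWN))"
)


def igb_events(lines):
    """Return (timestamp, interface, event) tuples for igb watchdog/link lines.

    Hypothetical helper; lines that do not match (NMI messages, queue
    dumps, garbled output) are skipped.
    """
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            events.append(
                (match.group("stamp"), match.group("ifname"), match.group("event"))
            )
    return events


if __name__ == "__main__":
    sample = [
        "Jan 20 21:09:25 sh4 kernel: NMI ... going to debugger",
        "Jan 20 21:09:38 sh4 kernel: igb0: Watchdog timeout -- resetting",
        "Jan 20 21:09:38 sh4 kernel: igb0: Queue(0) tdh = 944, hw tdt = 945",
        "Jan 20 21:09:38 sh4 kernel: igb0: link state changed to DOWN",
        "Jan 20 21:09:41 sh4 kernel: igb0: link state changed to UP",
    ]
    for stamp, ifname, event in igb_events(sample):
        print(stamp, ifname, event)
```

Feeding it the full log instead of the sample list should show whether igb1 hits the same watchdog timeout in the same order of events as igb0.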

Thanks in advance for any help.

~rick