Mark,

It may not help at all, but what kind of network interface hardware are you using?
We've seen occasional, strange dropoffs of interfaces based on RealTek chips. Odd, because one virtual interface will drop off while others, over the same hardware, stay live. We have not had good luck sorting out the problem, except that everyone is saying 'get rid of RealTek hardware', usually recommending Intel.

Interested in your comments.

Lou Picciano

----- Original Message -----
From: "Mark" <mark0...@gmail.com>
To: "Discussion list for OpenIndiana" <openindiana-discuss@openindiana.org>
Sent: Friday, February 25, 2011 2:59:43 AM
Subject: [OpenIndiana-discuss] It just trashed itself!!

I had an interesting issue today with one of my Open Indiana storage servers. It has around 15 smb/nfs shares and 40 TB of storage.

The problems may have slowly crept up on it, as logs from the nfs client showed slow response issues starting about 12 hours earlier. Eventually it ground to a halt and would not complete a console login. I achieved a normal shut-down via the power button, but on reboot it was somewhat stuffed.

On power up, it dropped into single user mode due to networking issues. A 'dladm show-phys' revealed some missing network devices. The box has two on-board ports and a quad gigabit card as igb devices, as well as a dual 10Gbit ixgbe, but only 3 x igb and 1 x ixgbe devices showed up. I tried another reboot, but that didn't help much either, as some were still missing. Then I tried a 'reboot -- -r', and that resulted in all the network devices disappearing.

Suspecting possible hardware issues, I booted off the text installation cdrom and found all the network devices were present and correct. A zpool import and scrub of the OS mirror showed no issues either.

About an hour later, after a full OS reinstall and reconfigure, it was back up in production with smb and nfs shares intact, thanks to the real virtues of zfs: recovery and portability.
(I have built a raw VM Workstation Open Solaris install on a SATA disk, moved it to an AMD and then an Intel processor box, and had no problems just booting it up.)

I've saved one of the mirrored OS disks for a post-mortem, to try to find out what happened. Some of the errors on screen suggested write issues to some /dev/ devices, but when a production system is down, rapid recovery is always the primary goal, and analysis took a back seat.

I've been slowly (try moving 40 TB in a hurry while keeping data available) upgrading the Open Solaris boxes to Open Indiana to resolve the scrub impact and some of the other issues I had encountered. These have been very reliable for up to two years so far. The oldest has been up for about a year, but this one only a month.

Hopefully this isn't a regular event, but I may keep a pre-built OS disk ready just in case.

If anyone has suggestions on what to look for in the wreckage, it would be helpful.

Mark.

[Sparing a thought for Christchurch Earthquake victims. Thankfully, my family there are all safe.]

_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss