Re: [PATCH md 2 of 4] Fix raid6 problem
On Sun, 13 Feb 2005, Mark Hahn wrote:

> > Interesting - the private mail was from me, and I've got two dual Opterons in service. The one with significantly more PCI activity has significantly more problems than the one with less PCI activity.
>
> that's pretty odd, since the most intense IO devices I know of are cluster interconnects (quadrics, myrinet, infiniband), and those vendors *love* opterons. I've never heard any of them say other than that Opteron IO handling is noticeably better than Intel's.
>
> otoh, I could easily believe that if you're running the Opteron systems in acts-like-a-faster-xeon mode (ie, not x86_64), you might be exercising some less-tested paths.

I was about to post that I've solved my problems with that Tyan dual Opteron motherboard, but it's still crap.

I upgraded the BIOS to the 2.02 beta and it seemed to work a lot better. I still couldn't boot off it with all 8 drives in, but solved that with a 32MB flash IDE unit holding /boot...

However, it dropped a drive during the initial sync of the raid6 arrays with lots of SCSI errors, and it had given lots of "DMA interrupt missing" messages, etc., through the day while I ran soak tests on it, so I'm going to conclude that that Tyan motherboard is utterly useless and deserves nothing more than being driven over. Slowly. With a steam roller. Then jumped on. Just to make me feel better.

Gordon

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH md 2 of 4] Fix raid6 problem
Mark Hahn wrote:

> > Interesting - the private mail was from me, and I've got two dual Opterons in service. The one with significantly more PCI activity has significantly more problems than the one with less PCI activity.
>
> that's pretty odd, since the most intense IO devices I know of are cluster interconnects (quadrics, myrinet, infiniband), and those vendors *love* opterons. I've never heard any of them say other than that Opteron IO handling is noticeably better than Intel's.

Sure, but which variables are changed between the rigs the vendors loved, and the rig we're having problems with?

> otoh, I could easily believe that if you're running the Opteron systems in acts-like-a-faster-xeon mode (ie, not x86_64), you might be exercising some less-tested paths.

It's running x86_64 (Fedora Core 3), and the problem is rooted in the chipset, I believe.

I don't think it's Opterons per se; I think it's just the Athlon take two - which is to say that it's a wonderful chip, but some of the chipsets it's saddled with are horrible, and careful selection (as well as heavy testing prior to putting a machine in service) is essential.

-Mike
Re: [PATCH md 2 of 4] Fix raid6 problem
Mike Hardy wrote:

> It's running x86_64 (Fedora Core 3), and the problem is rooted in the chipset, I believe.
>
> I don't think it's Opterons per se; I think it's just the Athlon take two - which is to say that it's a wonderful chip, but some of the chipsets it's saddled with are horrible, and careful selection (as well as heavy testing prior to putting a machine in service) is essential.

I'd be interested to hear the outcome, as I'll be looking at Opteron systems soon and, having been badly bitten on a dual Athlon (AMD768 Southbridge), would prefer to avoid a repeat of the experience.

Regards,
Richard
Re: [PATCH md 2 of 4] Fix raid6 problem
Gordon Henderson wrote:

> Anyone using Tyan Thunder K8W motherboards???
>
> I now know there is a K8S (server?) version of that mobo, but at the time it was all ordered, I wasn't aware of it - my thoughts are that there is some sort of PCI/PCI-X problem with either the motherboard or the chipset, and in all probability the K8S mobo will have the same chipset and the same problems anyway...

I'm using a K8W at work as a driver client for NAS testing. Onboard Broadcom GigE, Linksys Marvell GigE, 2xWD1200JD + 2xMaxtor Maxline Plus II as RAID-0 and RAID-5 using the Sil_3114, 2.4.29, raidtools 1.0. 2+2x1GB PC-2700 in first and third slots for each CPU. All PCI-X/HT configs set to Auto in BIOS and jumpers, 2.02b BIOS.

No issues except for a bad SATA cable.

Striping yields ~90MB/s, RAID-5 about 65r, 55w on 8GB Bonnie++ runs; 2GB dd reads on raw devices yield ~55MB/s.

Fedora Core 2 tests next week.
Re: [PATCH md 2 of 4] Fix raid6 problem
Gordon Henderson wrote:

> What I wanted was an 8-way RAID-1 for the boot partition (all of /, in reality) and I've done this many times in the past on other 2-5 way systems without issue. So I do the stuff I've done in the past, and there's nothing really new to me in that respect. (I'm using LILO)
>
> So when I try to get it to boot off the md device, it boots and says LIL and then nothing more. (Lilo diagnostics interpret this as a media failure, or geometry mismatch) If I make it boot off /dev/sda1 then it would work.

We put /boot on a 100MB /dev/sda1 partition; the rest of the drive is md. The Lilo script section does

    dd if=/dev/sda of=/boot/boot446.sda bs=446 count=1
    fdisk -l /dev/sda > /boot/fdisk.sda
    dd if=/dev/sda1 of=/dev/sdb1

every time a new kernel is built. Recovery is much easier without RAID involved (lilo 22.6). I've considered manipulating the boot block/disk label on copy so that it would boot off either sda1 or sdb1 transparently.

> (ie. boot off /dev/sda1, root on /dev/md1, an 8-way RAID-1)
>
> I tried many combinations of old (Debian woody) and new Lilo (compiled from the latest source); I even tried GRUB at one point with no luck either. It was more frustrating as the turn-around time is several minutes by the time you go through the BIOS to change the boot device, then reboot, change lilo.conf, then try again )-:
>
> It seemed more stable with just one PCI card in, so I have a 4-port card on order as a last ditch attempt to make it work - I did try re-flashing the BIOS on one board (I have 2), as it seemed to be about a year old and there are several updates on the Tyan web-site; however, that resulted in wiping out the BIOS - it seemed to be going just fine, then it went beep and was silent forever more )-: Anyone in the SW have a flash programmer/copier handy???

Flashing from 1.x to 2.02b, same problem. Power off, pull plug, pull both power connectors off mobo, wait 15 seconds, clear CMOS for 15 seconds, reboot, reset BIOS, no worries.

> Maybe, or maybe we just move to an Intel system, although power dissipation was a consideration and the Opterons are attractive in that aspect... The case has a 600W PSU before anyone asks..

Yeech. Get the 8131's working and you'll never go back. No Northbridge bottlenecks, thank you.
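
The /boot backup routine described in the message above (save the 446-byte MBR boot code, then raw-copy the /boot partition to a second disk) might be sketched as the shell functions below. This is only an illustration: the function names and the choice to pass every path as an argument are assumptions, not the poster's actual script. The 446-byte figure comes from the classic MBR layout, where the boot code is followed by the 64-byte partition table and a 2-byte signature.

```shell
#!/bin/sh
# Sketch only: hypothetical helper functions, not the poster's real script.
set -eu

# backup_boot_code DISK OUTFILE
# Save the first 446 bytes of DISK (MBR boot code only). The partition
# table (bytes 446-509) and boot signature (bytes 510-511) are left out,
# so restoring this file does not overwrite the partition layout.
backup_boot_code() {
    dd if="$1" of="$2" bs=446 count=1 2>/dev/null
}

# clone_partition SRC DST
# Raw, byte-for-byte copy of SRC onto DST. DST must be at least as
# large as SRC, and anything already on DST is overwritten.
clone_partition() {
    dd if="$1" of="$2" bs=64k 2>/dev/null
}
```

With the device names from the message, the calls would be `backup_boot_code /dev/sda /boot/boot446.sda` and `clone_partition /dev/sda1 /dev/sdb1`, run as root. Taking the paths as arguments means the functions can be tried on ordinary image files before being pointed at real disks.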