Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
Thanks for the comments, folks; your points are damn right. I believe what I experienced was data corruption via stale caches. So much for not opening up the case (for the first time, I resisted due to lack of time) and not checking whether these controllers are somehow linked together to exchange write-cache state. AFAIK they're just connected via SFF-8087 cables to the backplanes with dual-port SAS drives.

Going back to the LSI website to check whether these 9750s have an option to link their caches into one...

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On 01/14/12 2:32 PM, Vahan Yerkanian wrote:
> Thanks for the comments, folks; your points are damn right. I believe what I experienced was data corruption via stale caches. So much for not opening up the case (for the first time, I resisted due to lack of time) and not checking whether these controllers are somehow linked together to exchange write-cache state. AFAIK they're just connected via SFF-8087 cables to the backplanes with dual-port SAS drives.
>
> Going back to the LSI website to check whether these 9750s have an option to link their caches into one...

Who built/configured this system?

--
john r pierce      N 37, W 122
santa cruz ca      mid-left coast
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On 01/14/12 2:32 PM, Vahan Yerkanian wrote:
> Going back to the LSI website to check whether these 9750s have an option to link their caches into one...

I was curious: *all* the LSI SAS RAID cards say 'single controller multipathing', both the MegaRAID cards and the 3ware cards (I'm setting up a server that has a 9260-8i now). I looked at their optional software, and none of it implements any sort of multi-controller failover.

If you have split or multiple storage backplanes/enclosures, I'd split the disks between the two cards, and multipath between each controller and its respective disks. This will take another set of SAS cables, of course.

--
john r pierce      N 37, W 122
santa cruz ca      mid-left coast
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On Jan 15, 2012, at 2:52 AM, John R Pierce wrote:
> Who built/configured this system?

Someone you don't know. ;) A local distributor. The system was shipped with an MS OS and MPIO iSCSI targets installed, and was claimed to be tested. Of course I had to remove that offending OS and install CentOS :) You know the rest of the story.
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On Jan 15, 2012, at 3:52 AM, John R Pierce wrote:
> On 01/14/12 2:32 PM, Vahan Yerkanian wrote:
>> Going back to the LSI website to check whether these 9750s have an option to link their caches into one...
>
> I was curious: *all* the LSI SAS RAID cards say 'single controller multipathing', both the MegaRAID cards and the 3ware cards (I'm setting up a server that has a 9260-8i now). I looked at their optional software, and none of it implements any sort of multi-controller failover.
>
> If you have split or multiple storage backplanes/enclosures, I'd split the disks between the two cards, and multipath between each controller and its respective disks. This will take another set of SAS cables, of course.

At the moment I have two 9750-4i cards installed, each with an SFF-8087 x4 cable going to the same backplane, which holds dual-port SAS disks. This was supposed to be a load-balanced, multi-controller failover setup.
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On 01/14/12 3:55 PM, Vahan Yerkanian wrote:
> At the moment I have two 9750-4i cards installed, each with an SFF-8087 x4 cable going to the same backplane, which holds dual-port SAS disks. This was supposed to be a load-balanced, multi-controller failover setup.

AFAIK the only way to achieve that is to use plain SAS HBAs such as the LSI 2008 family (92xx cards), not RAID controllers.

This is a representative SAS midplane manual:
http://www.supermicro.com/manuals/other/BPN-SAS-936EL.pdf
It shows failover combinations starting on page 3-3...

--
john r pierce      N 37, W 122
santa cruz ca      mid-left coast
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On Friday, January 13, 2012, Vahan Yerkanian va...@arminco.com wrote:
> Hi, I was wondering if anyone has successfully configured two LSI/3ware 9750-4i series controllers for multipathing under CentOS 5.7 x86_64? I've tried some basic setups with both multibus and failover settings, and had repeatable filesystem corruption over an iSCSI (tgtd) or NFSv3 connection.

Have you tried multipathd?

-Ross
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On Jan 13, 2012, at 6:33 PM, Ross Walker wrote:
> On Friday, January 13, 2012, Vahan Yerkanian va...@arminco.com wrote:
>> Hi, I was wondering if anyone has successfully configured two LSI/3ware 9750-4i series controllers for multipathing under CentOS 5.7 x86_64? I've tried some basic setups with both multibus and failover settings, and had repeatable filesystem corruption over an iSCSI (tgtd) or NFSv3 connection.
>
> Have you tried multipathd?
>
> -Ross

Yes, sorry, I should've been clearer. I configured multipathing with multipathd using a bare-bones configuration, as it didn't have an LSI/3ware controller-specific preset in the devices {} block. What I did was based on [1] and in the end consisted of this (I thinned it down while trying to find the culprit):

multipath.conf:

    blacklist {
        devnode sda    # the boot disk
    }
    defaults {
        user_friendly_names yes
    }
    multipaths {
        multipath {
            alias storage
            wwid 3600050c15400f3ae0904
            path_grouping_policy multibus
        }
    }

multipath -ll showed everything OK, with both sdb and sdc (the same 24 x 3TB RAID6 array) as active and ready. However, no matter what I did, the filesystem got corrupted within 3-4 hours of active usage...

[1] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/pdf/DM_Multipath/Red_Hat_Enterprise_Linux-5-DM_Multipath-en-US.pdf
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On 01/13/12 6:41 AM, Vahan Yerkanian wrote:
> multipath -ll showed everything OK, with both sdb and sdc (the same 24 x 3TB RAID6 array) as active and ready.

Are those controllers aware you're using them for multipathing? RAID cards like that tend to have large caches, and one controller's cache won't see changes written through the other, leading to inconsistent data, unless the controllers have some form of back-channel communication between them to coordinate their caches.

BTW, that's _way_ too many disks in a single disk group; your rebuild times with a 24-disk RAID6 will be ouch long. I try to limit my RAID groups to 12 drives max, and stripe those. Given 24 disks, I'd probably have 2 hot spares and 2 x 11 RAID60, which would provide the space equivalent of 18 disks.

--
john r pierce      N 37, W 122
santa cruz ca      mid-left coast
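For anyone checking the "space equivalent of 18 disks" figure, here is a minimal Python sketch of the arithmetic. The function name and its parameters are made up for illustration; the only assumptions are that each RAID6 group gives up two disks' worth of capacity to parity and that hot spares hold no data.

```python
def raid60_layout(total_disks, hot_spares, groups, parity_per_group=2):
    """Return (disks_per_group, usable_disk_equivalents).

    Hypothetical helper: models a stripe over `groups` RAID6 groups,
    each losing `parity_per_group` disks of capacity to parity.
    """
    available = total_disks - hot_spares
    if available % groups != 0:
        raise ValueError("disks do not divide evenly across groups")
    per_group = available // groups
    if per_group <= parity_per_group:
        raise ValueError("group too small to hold any data")
    usable = groups * (per_group - parity_per_group)
    return per_group, usable


# The layout suggested above: 24 disks, 2 hot spares, 2 RAID6 groups.
print(raid60_layout(24, 2, 2))  # (11, 18)
# The original single 24-disk RAID6 group, no spares.
print(raid60_layout(24, 0, 1))  # (24, 22)
```

So the suggested layout trades four disks of capacity (22 down to 18) for two spares and much smaller rebuild domains.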
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On Jan 13, 2012, at 2:37 PM, John R Pierce pie...@hogranch.com wrote:
> On 01/13/12 6:41 AM, Vahan Yerkanian wrote:
>> multipath -ll showed everything OK, with both sdb and sdc (the same 24 x 3TB RAID6 array) as active and ready.
>
> Are those controllers aware you're using them for multipathing? RAID cards like that tend to have large caches, and one controller's cache won't see changes written through the other, leading to inconsistent data, unless the controllers have some form of back-channel communication between them to coordinate their caches.

John's right; I thought these were straight SAS/SATA controllers.

You will need to publish these disks as straight-through individual disks with write-through cache and use software RAID if the controllers can't communicate with each other. Some controllers are smart enough to perform multipathing across them, but they tend to cost more than $500.

The Dell PERC (LSI) RAID controllers I have at work do multipathing on-board between multiple connections to each enclosure, but not between multiple controllers. To do that I would need two plain SAS/SATA controllers and handle RAID in software. I have done that successfully with Solaris and ZFS in the past, but Linux software RAID wasn't performant enough for large RAID6s (in my experience).

> BTW, that's _way_ too many disks in a single disk group; your rebuild times with a 24-disk RAID6 will be ouch long. I try to limit my RAID groups to 12 drives max, and stripe those. Given 24 disks, I'd probably have 2 hot spares and 2 x 11 RAID60, which would provide the space equivalent of 18 disks.

I agree with John here too. Create two RAID6 groups and use software to stripe them, using either mdraid or LVM.

If it were me, I'd put each RAID6 on a separate controller for balanced parity calculations and then stripe the two volumes in LVM. Keep a third controller as a spare in the closet.

-Ross
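To make "stripe the two volumes" concrete, here is a small Python sketch of how a striped volume maps a logical byte offset onto two underlying volumes (one RAID6 group per controller). It is only a model of plain round-robin striping: the function name and the 64 KiB stripe size are assumptions chosen for illustration, not values from this thread or defaults of any tool.

```python
def locate(logical_offset, stripe_size=64 * 1024, volumes=2):
    """Map a logical byte offset to (volume_index, offset_within_volume).

    Hypothetical model of round-robin striping: consecutive stripes
    alternate between the underlying volumes.
    """
    stripe_index, within_stripe = divmod(logical_offset, stripe_size)
    volume = stripe_index % volumes
    volume_offset = (stripe_index // volumes) * stripe_size + within_stripe
    return volume, volume_offset


# The first stripe lands on volume 0, the next on volume 1, then back to 0.
for offset in (0, 64 * 1024, 2 * 64 * 1024 + 10):
    print(offset, "->", locate(offset))
```

Because consecutive stripes alternate between the volumes, large sequential I/O is spread across both controllers, which is the balancing effect described above.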
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On 01/13/12 3:46 PM, Ross Walker wrote:
> You will need to publish these disks as straight-through individual disks with write-through cache and use software RAID if the controllers can't communicate with each other.

Write-through cache is not even good enough. If a given block is written through one of them, it will land on the disk, but it won't update the other controller's cache. The other controller could then have stale data in its cache, and if a later read takes that path, it will get the old data, and things go downhill quickly from there.

--
john r pierce      N 37, W 122
santa cruz ca      mid-left coast
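The failure mode described above can be reproduced with a toy model. The Python sketch below is not how any real controller firmware works; the `Disk` and `Controller` classes are invented purely to show why two independent write-through caches in front of the same disk can return stale data, and why turning caching off avoids it.

```python
class Disk:
    """Shared backing store reachable through both controllers."""
    def __init__(self):
        self.blocks = {}


class Controller:
    """Toy controller with an optional private write-through cache."""
    def __init__(self, disk, caching=True):
        self.disk = disk
        self.caching = caching
        self.cache = {}

    def write(self, lba, data):
        # Write-through: the data always reaches the disk immediately.
        self.disk.blocks[lba] = data
        if self.caching:
            self.cache[lba] = data

    def read(self, lba):
        # A cache hit never consults the disk, so it can be stale.
        if self.caching and lba in self.cache:
            return self.cache[lba]
        data = self.disk.blocks.get(lba)
        if self.caching:
            self.cache[lba] = data
        return data


def demo(caching):
    disk = Disk()
    a = Controller(disk, caching)
    b = Controller(disk, caching)
    a.write(0, "v1")
    b.read(0)         # path B now holds "v1" in its own cache
    a.write(0, "v2")  # path A updates the disk; B's cache is not told
    return b.read(0)  # what a multipath read via B would see


print("caching on: ", demo(caching=True))   # v1 (stale)
print("caching off:", demo(caching=False))  # v2 (current)
```

With multibus load balancing, reads and writes for the same block are spread over both paths, so this interleaving is routine rather than an edge case.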
Re: [CentOS] LSI/3ware 9750-4i and multipath I/O
On Jan 13, 2012, at 6:51 PM, John R Pierce pie...@hogranch.com wrote:
> On 01/13/12 3:46 PM, Ross Walker wrote:
>> You will need to publish these disks as straight-through individual disks with write-through cache and use software RAID if the controllers can't communicate with each other.
>
> Write-through cache is not even good enough. If a given block is written through one of them, it will land on the disk, but it won't update the other controller's cache. The other controller could then have stale data in its cache, and if a later read takes that path, it will get the old data, and things go downhill quickly from there.

And read-through too; basically, disable caching on the controllers.

-Ross