Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-14 Thread Vahan Yerkanian
Thanks for the comments, folks, your points are damn right. I believe what I
experienced was data corruption caused by stale caches.

So much for not opening up the case (it would have been the first time, and I
resisted for lack of time) and not checking whether these controllers are
somehow linked together to exchange write cache etc.
AFAIK they're just connected via SFF-8087 cables to the backplanes with
dual-port SAS drives.

Going back to the LSI website to check whether these 9750s have an option to
link their caches into one...
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-14 Thread John R Pierce
On 01/14/12 2:32 PM, Vahan Yerkanian wrote:
> Thanks for the comments, folks, your points are damn right. I believe what I
> experienced was data corruption caused by stale caches.
>
> So much for not opening up the case (it would have been the first time, and I
> resisted for lack of time) and not checking whether these controllers are
> somehow linked together to exchange write cache etc.
> AFAIK they're just connected via SFF-8087 cables to the backplanes with
> dual-port SAS drives.
>
> Going back to the LSI website to check whether these 9750s have an option to
> link their caches into one...

who built/configured this system?



-- 
john r pierce                            N 37, W 122
santa cruz ca mid-left coast




Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-14 Thread John R Pierce
On 01/14/12 2:32 PM, Vahan Yerkanian wrote:
> Going back to the LSI website to check whether these 9750s have an option to
> link their caches into one...

I was curious, so I checked: *all* the LSI SAS RAID cards say 'single
controller multipathing', both the MegaRAID cards and the 3ware cards (I'm
setting up a server that has a 9260-8i now). I looked at their optional
software, and none of it implements any sort of multi-controller failover.

If you have split or multiple storage backplanes/enclosures, I'd split
the disks between the two cards, and multipath between each controller
and its respective disks. This will take another set of SAS cables, of
course.






Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-14 Thread Vahan Yerkanian

On Jan 15, 2012, at 2:52 AM, John R Pierce wrote:

 
> who built/configured this system?
 
 

Someone you don't know. ;) A local distributor. The system shipped with a
Microsoft OS with MPIO iSCSI targets installed, and was claimed to be tested.
Of course I had to remove that offending OS and install CentOS :)

You know the rest of the story.


Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-14 Thread Vahan Yerkanian
On Jan 15, 2012, at 3:52 AM, John R Pierce wrote:

> On 01/14/12 2:32 PM, Vahan Yerkanian wrote:
>> Going back to the LSI website to check whether these 9750s have an option to
>> link their caches into one...
>
> I was curious, so I checked: *all* the LSI SAS RAID cards say 'single
> controller multipathing', both the MegaRAID cards and the 3ware cards (I'm
> setting up a server that has a 9260-8i now). I looked at their optional
> software, and none of it implements any sort of multi-controller failover.
>
> If you have split or multiple storage backplanes/enclosures, I'd split
> the disks between the two cards, and multipath between each controller
> and its respective disks. This will take another set of SAS cables, of
> course.
 

At the moment I have two 9750-4i cards installed, each with an SFF-8087 x4
cable going to the same backplane, which holds dual-port SAS disks.

This was supposed to be a load-balanced, multi-controller failover setup.


Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-14 Thread John R Pierce
On 01/14/12 3:55 PM, Vahan Yerkanian wrote:
> At the moment I have two 9750-4i cards installed, each with an SFF-8087 x4
> cable going to the same backplane, which holds dual-port SAS disks.
>
> This was supposed to be a load-balanced, multi-controller failover setup.

AFAIK the only way to achieve that is to use plain SAS HBAs such as the
LSI 2008 family (92xx cards), not RAID controllers.

This is a representative SAS midplane manual:
http://www.supermicro.com/manuals/other/BPN-SAS-936EL.pdf

It shows failover combinations starting on page 3-3...





Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-13 Thread Ross Walker
On Friday, January 13, 2012, Vahan Yerkanian va...@arminco.com wrote:
> Hi,
>
> I was wondering if anyone has successfully configured two LSI/3ware
> 9750-4i series controllers for multipathing under CentOS 5.7 x86_64?
>
> I've tried some basic setups with both multibus and failover settings,
> and had repeatable filesystem corruption over an iSCSI (tgtd) or NFSv3
> connection.

Have you tried multipathd?

-Ross


Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-13 Thread Vahan Yerkanian
On Jan 13, 2012, at 6:33 PM, Ross Walker wrote:

> On Friday, January 13, 2012, Vahan Yerkanian va...@arminco.com wrote:
>> Hi,
>>
>> I was wondering if anyone has successfully configured two LSI/3ware
>> 9750-4i series controllers for multipathing under CentOS 5.7 x86_64?
>>
>> I've tried some basic setups with both multibus and failover settings,
>> and had repeatable filesystem corruption over an iSCSI (tgtd) or NFSv3
>> connection.
>
> Have you tried multipathd?
>
> -Ross

Yes, sorry, I should've been clearer. I configured multipathing with
multipathd using a bare-bones configuration, as it didn't have an LSI/3ware
controller-specific preset in the devices {} block.

What I did was based on [1], and in the end it came down to this
multipath.conf (I thinned it down while trying to find the culprit):

blacklist {
    devnode sda    # the boot disk
}

defaults {
    user_friendly_names yes
}

multipaths {
    multipath {
        alias storage
        wwid 3600050c15400f3ae0904
        path_grouping_policy multibus
    }
}


multipath -ll showed everything OK, with both sdb and sdc (the same 24 x 3 TB
RAID6 array) as active and ready.

However, no matter what I did, the filesystem got corrupted within 3-4 hours
of active usage...



[1] 
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/pdf/DM_Multipath/Red_Hat_Enterprise_Linux-5-DM_Multipath-en-US.pdf


Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-13 Thread John R Pierce
On 01/13/12 6:41 AM, Vahan Yerkanian wrote:
> multipath -ll showed everything OK, with both sdb and sdc (the same 24 x 3 TB
> RAID6 array) as active and ready.


Are those controllers aware you're using them for multipathing? RAID
cards like that tend to have large caches, and one controller's cache
won't see changes written through the other, leading to inconsistent data,
unless the controllers have some form of back-channel communication
between them to coordinate their caches.

BTW, that's _way_ too many disks in a single disk group; your disk
rebuild times with a 24-disk RAID6 will be ouch long. I try to limit my RAID
groups to 12 drives max, and stripe those. Given 24 disks, I'd
probably have 2 hot spares and a RAID60 of 2 x 11 disks, which would provide
the space equivalent of 18 disks.
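
The capacity arithmetic above can be checked with a short sketch. It assumes
equal-sized disks and two parity disks per RAID6 group; the function name and
its parameters are made up for illustration and do not come from any LSI tool.

```python
def raid60_usable_disks(total_disks, hot_spares, groups, parity_per_group=2):
    """Return the number of data-disk equivalents in a RAID60 layout.

    Hypothetical helper: assumes identical disks split evenly across groups.
    """
    in_array = total_disks - hot_spares
    if in_array % groups != 0:
        raise ValueError("disks do not divide evenly across the RAID6 groups")
    per_group = in_array // groups
    if per_group <= parity_per_group:
        raise ValueError("each group needs more disks than parity disks")
    # Every RAID6 group gives up parity_per_group disks' worth of space.
    return groups * (per_group - parity_per_group)


# 24 disks, 2 hot spares, 2 groups of 11 -> 2 * (11 - 2) data disks
print(raid60_usable_disks(24, hot_spares=2, groups=2))
# Single 24-disk RAID6 with no spares, for comparison
print(raid60_usable_disks(24, hot_spares=0, groups=1))
```

With 3 TB disks, the 2 x 11 layout works out to 18 x 3 = 54 TB before
filesystem overhead, against 22 x 3 = 66 TB for the single 24-disk group.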





Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-13 Thread Ross Walker
On Jan 13, 2012, at 2:37 PM, John R Pierce pie...@hogranch.com wrote:

> On 01/13/12 6:41 AM, Vahan Yerkanian wrote:
>> multipath -ll showed everything OK, with both sdb and sdc (the same 24 x 3 TB
>> RAID6 array) as active and ready.
>
> Are those controllers aware you're using them for multipathing? RAID
> cards like that tend to have large caches, and one controller's cache
> won't see changes written through the other, leading to inconsistent data,
> unless the controllers have some form of back-channel communication
> between them to coordinate their caches.

John's right, I thought these were straight SAS/SATA controllers.

You will need to publish these disks as straight through individual disks with 
write-through cache and use software RAID if the controllers can't communicate 
with each other.

Some controllers are smart enough to perform multipathing across them but they 
tend to cost more than $500.

The Dell PERC (LSI) RAID controllers I have at work do multipathing on-board 
between multiple connections to each enclosure, but not between multiple 
controllers. To do that I would need two plain SAS/SATA controllers and handle 
RAID in software.

I have done that successfully with Solaris and ZFS in the past, but Linux 
software RAID wasn't performant enough for large RAID6s (in my experience).


> BTW, that's _way_ too many disks in a single disk group; your disk
> rebuild times with a 24-disk RAID6 will be ouch long. I try to limit my RAID
> groups to 12 drives max, and stripe those. Given 24 disks, I'd
> probably have 2 hot spares and a RAID60 of 2 x 11 disks, which would provide
> the space equivalent of 18 disks.

I agree with John here too.

Create two RAID6 groups and use software to stripe them, using either mdraid
or LVM.

If it were me, I'd put each RAID6 on a separate controller for balanced parity 
calculations and then stripe the two volumes in LVM. Keep a third controller as 
a spare in the closet.
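
As a rough picture of what striping two volumes means, the sketch below maps
a logical block to one of the two RAID6 volumes in round-robin fashion. It is
a simplified model, not the actual code of LVM or mdraid, and the function
name and stripe size are assumptions chosen for illustration.

```python
def stripe_map(logical_block, stripe_blocks, volumes=2):
    """Map a logical block to (volume index, block offset on that volume).

    Hypothetical model of RAID0-style striping: consecutive stripes of
    stripe_blocks blocks alternate across the volumes.
    """
    stripe_no, within = divmod(logical_block, stripe_blocks)
    volume = stripe_no % volumes          # which RAID6 volume holds the stripe
    volume_stripe = stripe_no // volumes  # index of the stripe on that volume
    return volume, volume_stripe * stripe_blocks + within


# With 4-block stripes, blocks 0-3 go to volume 0, blocks 4-7 to volume 1,
# blocks 8-11 back to volume 0, and so on.
for block in (0, 3, 4, 9):
    print(block, stripe_map(block, stripe_blocks=4))
```

Sequential I/O therefore alternates between the two controllers, which is
where the balanced load in this layout comes from.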

-Ross



Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-13 Thread John R Pierce
On 01/13/12 3:46 PM, Ross Walker wrote:
> You will need to publish these disks as straight through individual disks
> with write-through cache and use software RAID if the controllers can't
> communicate with each other.

Write-through cache is not even good enough. If a given block is
written through one of them, it will land on the disk, but it won't
update the other controller's cache, so the other controller could have
stale data in its cache, and if another read takes that path, it will get
the old data, and things go downhill from there quickly.
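
The failure described above can be reproduced with a toy model. This is only
a sketch under simplifying assumptions: two controllers with private
write-through caches and no back channel, sharing one disk. The class and
function names are invented for the example and say nothing about how the
9750 firmware actually works.

```python
class Disk:
    """Shared backing store reachable from both controllers."""

    def __init__(self):
        self.blocks = {}


class Controller:
    """Toy RAID controller with a private cache and no link to its peer."""

    def __init__(self, disk, cache_enabled=True):
        self.disk = disk
        self.cache = {}
        self.cache_enabled = cache_enabled

    def write(self, lba, data):
        # Write-through: the data lands on the disk immediately...
        self.disk.blocks[lba] = data
        # ...but only this controller's own cache learns about it.
        if self.cache_enabled:
            self.cache[lba] = data

    def read(self, lba):
        # A cached block is served without consulting the disk.
        if self.cache_enabled and lba in self.cache:
            return self.cache[lba]
        data = self.disk.blocks.get(lba)
        if self.cache_enabled:
            self.cache[lba] = data
        return data


def read_after_cross_path_write(cache_enabled):
    disk = Disk()
    a = Controller(disk, cache_enabled)
    b = Controller(disk, cache_enabled)
    a.write(0, "v1")
    b.read(0)          # path B now holds "v1" in its cache
    a.write(0, "v2")   # the update goes through path A only
    return b.read(0)   # the next read happens to take path B


print(read_after_cross_path_write(cache_enabled=True))   # v1 (stale)
print(read_after_cross_path_write(cache_enabled=False))  # v2 (current)
```

In this model the stale read disappears only when the controllers do no
caching at all, on reads as well as writes.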






Re: [CentOS] LSI/3ware 9750-4i and multipath I/O

2012-01-13 Thread Ross Walker
On Jan 13, 2012, at 6:51 PM, John R Pierce pie...@hogranch.com wrote:

> On 01/13/12 3:46 PM, Ross Walker wrote:
>> You will need to publish these disks as straight through individual disks
>> with write-through cache and use software RAID if the controllers can't
>> communicate with each other.
>
> Write-through cache is not even good enough. If a given block is
> written through one of them, it will land on the disk, but it won't
> update the other controller's cache, so the other controller could have
> stale data in its cache, and if another read takes that path, it will get
> the old data, and things go downhill from there quickly.

And read-through too; basically, disable caching on the controllers.

-Ross
