Re: How to create initrd.img to boot LVM-on-RAID0?
Ian Ward Comfort wrote:
: On Oct 3, 2007, at 10:18 AM, Dean S. Messing wrote:
: > I've created a software RAID-0, defined a Volume Group on it with
: > (currently) a single logical volume, and copied my entire
: > installation onto it, modifying the copied fstab to reflect where
: > the new "/" is.
:
: > mkinitrd --preload raid0 --with=raid0 initrd_raid.img 2.6.22.5-76-fc7
:
: > But the thing won't complete the boot process.  From the boot
: > messages it appears to not be starting the array, so when it goes
: > to scan for LVs it doesn't find the one that's sitting on top of
: > the array where root lives.
: >
: > Are there instructions for how to make this work?  I've googled for
: > a couple of hours, tried a bunch of stuff, but can't get it to
: > work.  From what I've read I suspect I must hand-tweak the "init"
: > file in the initrd.
:
: You're probably correct in that your initrd's nash script currently
: lacks any "raidautorun" directives, since mkinitrd is looking at your
: old fstab and finding that your old root device is not on RAID.
:
: Instead of specifying modules yourself and hand-tweaking the nash
: script, just ask mkinitrd to figure out the correct modules to load
: and RAID devices to start on its own, given your chosen boot device,
: with the --fstab option.
:
: mkinitrd --fstab=/newroot/etc/fstab initrd_raid.img 2.6.22.5-76-fc7

That did the trick!  Thanks a lot, Ian.

Dean
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
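Ian's --fstab fix can be sanity-checked by unpacking the rebuilt image and looking for the "raidautorun" directive he mentions. A minimal sketch — the here-doc below is a stand-in for the real nash init script you would extract with `zcat initrd_raid.img | cpio -idm`, and the device name is illustrative:

```shell
# Check a nash init script for the raidautorun directive that starts md arrays.
# The here-doc stands in for a real unpacked initrd's init script.
TMP=$(mktemp -d)
cat > "$TMP/init" <<'EOF'
raidautorun /dev/md0
lvm vgscan
lvm vgchange -ay
EOF
if grep -q '^raidautorun' "$TMP/init"; then
    STATUS="raid-start-present"
else
    STATUS="raid-start-missing"
fi
echo "$STATUS"
rm -rf "$TMP"
```

If the grep finds nothing, the array is never assembled and the LV scan comes up empty — exactly the symptom in this thread.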
Re: How to create initrd.img to boot LVM-on-RAID0?
Goswin von Brederlow wrote:
: "Dean S. Messing" <[EMAIL PROTECTED]> writes:
:
: > I'm having the devil of a time trying to boot off
: > an "LVM-on-RAID0" device on my Fedora 7 system.
: >
: > I've created a software RAID-0, defined a Volume Group on it with
: > (currently) a single logical volume, and copied my entire
: > installation onto it, modifying the copied fstab to reflect where
: > the new "/" is.
: >
: > I created a new initrd with:
: >
: > mkinitrd --preload raid0 --with=raid0 initrd_raid.img 2.6.22.5-76-fc7
: >
: > The LVM modules are getting included in the initrd "for free" because
: > I'm currently running on a non-raid LV-managed file system.
: >
: > I added a stanza to grub.conf for the new initrd.img.
: >
: > But the thing won't complete the boot process.
: > From the boot messages it appears to not
: > be starting the array, so when it goes to scan for LVs it doesn't
: > find the one that's sitting on top of the array where root lives.
:
: Maybe your lvm.conf filters it out, or the devices needed to access
: it are missing?

I am not at a point in my meagre understanding to fool with it.  When I
unpack the initrd, I see the raid modules but nothing in the "init"
script that activates the array.  The lvm.conf file is the default one.

: > Are there instructions for how to make this work?  I've googled for
: > a couple of hours, tried a bunch of stuff, but can't get it to
: > work.  From what I've read I suspect I must hand-tweak the "init"
: > file in the initrd.
: >
: > Surely there is "a right way" to do this.
:
: Install debian, live happily ever after. :)

No comment.

: By the way, why bother with raid0?  lvm can do striping on its own,
: saving you one layer altogether, and you already have lvm working
: right.  Why needlessly add problems to your working system?

I'm experimenting (and learning) right now.  I'm aware of the striped
lv option, which I will also try.  Thanks.

By the way, the current system is working with LVM because at
installation time I told it to use LVM on the installed system.

Dean
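The striped-LV route Goswin suggests drops the md layer entirely: LVM stripes across the physical volumes itself. A rough sketch of what that would look like — the volume group, LV name, sizes, and partitions are hypothetical, and with APPLY left at 0 the block only prints the command instead of touching any devices:

```shell
# LVM-native striping: one layer instead of LVM-on-RAID0.
APPLY=${APPLY:-0}          # set to 1 to really create the VG/LV (needs root)
STRIPES=2                  # one stripe per physical volume
CHUNK=256                  # stripe size in KiB, mirroring a typical md chunk
CMD="lvcreate -i $STRIPES -I $CHUNK -L 100G -n lv_root vg_striped"
if [ "$APPLY" = 1 ]; then
    vgcreate vg_striped /dev/sdb1 /dev/sdc1
    $CMD
else
    echo "would run: $CMD"
fi
```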
Re: RAID 5 performance issue.
On Wed, 3 Oct 2007 13:36:39 -0700, David Rees wrote:

> > # xfs_db -c frag -f /dev/md0
> > actual 1828276, ideal 1708782, fragmentation factor 6.54%
> >
> > Good or bad?
>
> Not bad, but not that good, either.  Try running xfs_fsr from a
> nightly cronjob.  By default, it will defrag mounted xfs filesystems
> for up to 2 hours.  Typically this is enough to keep fragmentation
> well below 1%.

Worth a shot.

> -Dave

Andrew
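David's suggestion can be wired up with a single crontab line. A sketch, assuming xfs_fsr lives in /usr/sbin on this system (the 7200-second cap just makes xfs_fsr's default two-hour limit explicit):

```
# /etc/crontab entry: defragment mounted XFS filesystems nightly at 03:00,
# stopping after at most 7200 seconds (xfs_fsr's default runtime limit).
# m  h  dom mon dow user  command
  0  3   *   *   *  root  /usr/sbin/xfs_fsr -t 7200
```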
Re: RAID 5 performance issue.
On Wed, 3 Oct 2007 16:35:21 -0400 (EDT), Justin Piszcz wrote:

> What does cat /sys/block/md0/md/mismatch_cnt say?

$ cat /sys/block/md0/md/mismatch_cnt
0

> That fragmentation looks normal/fine.

Cool.

> Justin.

Andrew
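mismatch_cnt is only meaningful after a consistency scan has run. A sketch of forcing one — the device path follows the thread, and the block stays a dry run by default since the real commands need root and a live array:

```shell
# Trigger an md parity check, then read the mismatch count.
MD=/sys/block/md0/md
APPLY=${APPLY:-0}          # set to 1 on a real system (needs root)
if [ "$APPLY" = 1 ] && [ -w "$MD/sync_action" ]; then
    echo check > "$MD/sync_action"   # background parity scan; watch /proc/mdstat
    cat "$MD/mismatch_cnt"           # 0 after the scan: parity matched everywhere
    RESULT=applied
else
    RESULT=dry-run
    echo "dry run: would write 'check' to $MD/sync_action"
fi
```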
Re: RAID 5 performance issue.
Andrew Clayton wrote:
> Yeah, I was wondering about that.  It certainly hasn't improved
> things; it's unclear if it's made things any worse.

Many 3124 cards are PCI-X, so if you have one of these (and you seem to
be using a server board, which may well have PCI-X), bus performance is
not going to be an issue.

Regards,
Richard
Re: RAID 5 performance issue.
On 10/3/07, Andrew Clayton <[EMAIL PROTECTED]> wrote:
> On Wed, 3 Oct 2007 12:43:24 -0400 (EDT), Justin Piszcz wrote:
> > Have you checked fragmentation?
>
> You know, that never even occurred to me.  I've gotten into the
> mindset that it's generally not a problem under Linux.

It's probably not the root cause, but certainly doesn't help things.
At least with XFS you have an easy way to defrag the filesystem without
even taking it offline.

> # xfs_db -c frag -f /dev/md0
> actual 1828276, ideal 1708782, fragmentation factor 6.54%
>
> Good or bad?

Not bad, but not that good, either.  Try running xfs_fsr from a nightly
cronjob.  By default, it will defrag mounted xfs filesystems for up to
2 hours.  Typically this is enough to keep fragmentation well below 1%.

-Dave
Re: RAID 5 performance issue.
What does cat /sys/block/md0/md/mismatch_cnt say?

That fragmentation looks normal/fine.

Justin.

On Wed, 3 Oct 2007, Andrew Clayton wrote:

> On Wed, 3 Oct 2007 12:43:24 -0400 (EDT), Justin Piszcz wrote:
> > Have you checked fragmentation?
>
> You know, that never even occurred to me.  I've gotten into the
> mindset that it's generally not a problem under Linux.
>
> > xfs_db -c frag -f /dev/md3
> >
> > What does this report?
>
> # xfs_db -c frag -f /dev/md0
> actual 1828276, ideal 1708782, fragmentation factor 6.54%
>
> Good or bad?  Seeing as this filesystem will be three years old in
> December, that doesn't seem overly bad.
>
> I'm currently looking to things like http://lwn.net/Articles/249450/
> and http://lwn.net/Articles/242559/ for potential help; fortunately
> it seems I won't have too long to wait.
>
> > Justin.
>
> Cheers,
> Andrew
Re: RAID 5 performance issue.
On Wed, 03 Oct 2007 19:53:08 +0200, Goswin von Brederlow wrote:

> Andrew Clayton <[EMAIL PROTECTED]> writes:
>
> > Hi,
> >
> > Hardware:
> >
> > Dual Opteron 2GHz cpus.  2GB RAM.  4 x 250GB SATA hard drives.  1
> > (root file system) is connected to the onboard Silicon Image 3114
> > controller.  The other 3 (/home) are in a software RAID 5 connected
> > to a PCI Silicon Image 3124 card.  I moved the 3 raid disks off the
> > on board controller onto the card the other day to see if that
> > would help, it didn't.
>
> I would think the onboard controller is connected to the north or
> south bridge and possibly hooked directly into the hypertransport.
> The extra controller is PCI, so you are limited to a theoretical
> 128MiB/s.  For me the onboard chips do much better (though at higher
> cpu cost) than pci cards.

Yeah, I was wondering about that.  It certainly hasn't improved things;
it's unclear if it's made things any worse.

> MfG
> Goswin

Cheers,
Andrew
Re: RAID 5 performance issue.
On Wed, 3 Oct 2007 12:43:24 -0400 (EDT), Justin Piszcz wrote:

> Have you checked fragmentation?

You know, that never even occurred to me.  I've gotten into the mindset
that it's generally not a problem under Linux.

> xfs_db -c frag -f /dev/md3
>
> What does this report?

# xfs_db -c frag -f /dev/md0
actual 1828276, ideal 1708782, fragmentation factor 6.54%

Good or bad?  Seeing as this filesystem will be three years old in
December, that doesn't seem overly bad.

I'm currently looking to things like http://lwn.net/Articles/249450/
and http://lwn.net/Articles/242559/ for potential help; fortunately it
seems I won't have too long to wait.

> Justin.

Cheers,
Andrew
Re: RAID 5 performance issue.
On Wed, 3 Oct 2007, Andrew Clayton wrote:

> On Wed, 3 Oct 2007 12:48:27 -0400 (EDT), Justin Piszcz wrote:
>
> > Also if it is software raid, when you make the XFS filesystem on
> > it, it sets up a proper (and tuned) sunit/swidth, so why would you
> > want to change that?
>
> Oh, I didn't; the sunit and swidth were set automatically.  Do they
> look sane?  From reading the XFS section of the mount man page, I'm
> not entirely sure what they specify and certainly wouldn't have any
> idea what to set them to.
>
> Cheers,
> Andrew

You should not need to set them as mount options unless you are
overriding the defaults.

Justin.
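For reference, the sunit=512,swidth=1024 values seen in this thread fall straight out of the array geometry described in the original post (256 KiB chunk, 3-disk RAID5, hence 2 data disks); the mount options count 512-byte sectors. A small arithmetic sketch:

```shell
# Derive XFS sunit/swidth (in 512-byte sectors) from md geometry.
CHUNK_KB=256        # md chunk size from the thread
DATA_DISKS=2        # 3-disk RAID5 = 2 data disks per stripe
SUNIT=$((CHUNK_KB * 2))          # KiB -> 512-byte sectors: one chunk
SWIDTH=$((SUNIT * DATA_DISKS))   # one full data stripe
echo "sunit=$SUNIT,swidth=$SWIDTH"
```

This also agrees with the xfs_info output in the thread (sunit=64, swidth=128 in 4 KiB blocks, i.e. 256 KiB and 512 KiB).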
Re: How to create initrd.img to boot LVM-on-RAID0?
"Dean S. Messing" <[EMAIL PROTECTED]> writes:

> I'm having the devil of a time trying to boot off
> an "LVM-on-RAID0" device on my Fedora 7 system.
>
> I've created a software RAID-0, defined a Volume Group on it with
> (currently) a single logical volume, and copied my entire
> installation onto it, modifying the copied fstab to reflect where
> the new "/" is.
>
> I created a new initrd with:
>
> mkinitrd --preload raid0 --with=raid0 initrd_raid.img 2.6.22.5-76-fc7
>
> The LVM modules are getting included in the initrd "for free" because
> I'm currently running on a non-raid LV-managed file system.
>
> I added a stanza to grub.conf for the new initrd.img.
>
> But the thing won't complete the boot process.
> From the boot messages it appears to not
> be starting the array, so when it goes to scan for LVs it doesn't
> find the one that's sitting on top of the array where root lives.

Maybe your lvm.conf filters it out, or the devices needed to access it
are missing?

> Are there instructions for how to make this work?  I've googled for
> a couple of hours, tried a bunch of stuff, but can't get it to
> work.  From what I've read I suspect I must hand-tweak the "init"
> file in the initrd.
>
> Surely there is "a right way" to do this.

Install debian, live happily ever after. :)

By the way, why bother with raid0?  lvm can do striping on its own,
saving you one layer altogether, and you already have lvm working
right.  Why needlessly add problems to your working system?

> (And, yes, my /boot partition is an ordinary device, not involved
> with the RAID0 or the LVs.)
>
> Dean

MfG
Goswin
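Goswin's lvm.conf point is worth spelling out: a devices filter that rejects md devices would make vgscan skip the VG sitting on the array. A permissive example of the relevant stanza — syntax per lvm.conf(5), with illustrative patterns:

```
# /etc/lvm/lvm.conf -- accept md and sd devices, reject everything else.
# A filter lacking the md line would hide an LVM-on-RAID volume group.
devices {
    filter = [ "a|/dev/md.*|", "a|/dev/sd.*|", "r|.*|" ]
}
```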
Re: RAID 5 performance issue.
Andrew Clayton <[EMAIL PROTECTED]> writes:

> Hi,
>
> Hardware:
>
> Dual Opteron 2GHz cpus.  2GB RAM.  4 x 250GB SATA hard drives.  1
> (root file system) is connected to the onboard Silicon Image 3114
> controller.  The other 3 (/home) are in a software RAID 5 connected
> to a PCI Silicon Image 3124 card.  I moved the 3 raid disks off the
> on board controller onto the card the other day to see if that would
> help, it didn't.

I would think the onboard controller is connected to the north or south
bridge and possibly hooked directly into the hypertransport.  The extra
controller is PCI, so you are limited to a theoretical 128MiB/s.  For
me the onboard chips do much better (though at higher cpu cost) than
pci cards.

MfG
Goswin
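Goswin's 128 MiB/s figure is just the plain PCI bus ceiling, shared by every device on the bus. The back-of-envelope, using the standard 32-bit/33 MHz PCI and 64-bit/133 MHz PCI-X numbers:

```shell
# Theoretical peak bus bandwidth: clock (MHz) * transfer width (bytes).
PCI=$((33 * 4))        # 32-bit @ 33 MHz  -> ~132 MB/s, shared by all devices
PCIX=$((133 * 8))      # 64-bit @ 133 MHz -> ~1064 MB/s
echo "PCI: ${PCI} MB/s  PCI-X: ${PCIX} MB/s"
```

Three striped disks at roughly 60 MB/s each already saturate plain PCI, which is why a PCI-X slot (as Richard notes elsewhere in the thread) changes the picture.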
How to create initrd.img to boot LVM-on-RAID0?
I'm having the devil of a time trying to boot off an "LVM-on-RAID0"
device on my Fedora 7 system.

I've created a software RAID-0, defined a Volume Group on it with
(currently) a single logical volume, and copied my entire installation
onto it, modifying the copied fstab to reflect where the new "/" is.

I created a new initrd with:

mkinitrd --preload raid0 --with=raid0 initrd_raid.img 2.6.22.5-76-fc7

The LVM modules are getting included in the initrd "for free" because
I'm currently running on a non-raid LV-managed file system.

I added a stanza to grub.conf for the new initrd.img.

But the thing won't complete the boot process.  From the boot messages
it appears to not be starting the array, so when it goes to scan for
LVs it doesn't find the one that's sitting on top of the array where
root lives.

Are there instructions for how to make this work?  I've googled for a
couple of hours, tried a bunch of stuff, but can't get it to work.
From what I've read I suspect I must hand-tweak the "init" file in the
initrd.

Surely there is "a right way" to do this.

(And, yes, my /boot partition is an ordinary device, not involved with
the RAID0 or the LVs.)

Dean
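For the record, the grub.conf stanza the post refers to would look roughly like this — the kernel version is the one from the post, while the VG/LV names and the /boot partition location are hypothetical:

```
# grub.conf stanza booting the new initrd; /boot is an ordinary
# partition (here (hd0,0)), outside the RAID0 and the LVs.
title Fedora 7 (LVM on RAID0)
        root (hd0,0)
        kernel /vmlinuz-2.6.22.5-76-fc7 ro root=/dev/VolGroupRaid/LogVolRoot
        initrd /initrd_raid.img
```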
Re: RAID 5 performance issue.
Also if it is software raid, when you make the XFS filesystem on it, it
sets up a proper (and tuned) sunit/swidth, so why would you want to
change that?

Justin.

On Wed, 3 Oct 2007, Justin Piszcz wrote:

> Have you checked fragmentation?
>
> xfs_db -c frag -f /dev/md3
>
> What does this report?
>
> Justin.
>
> On Wed, 3 Oct 2007, Andrew Clayton wrote:
>
> > Hi,
> >
> > Hardware:
> >
> > Dual Opteron 2GHz cpus.  2GB RAM.  4 x 250GB SATA hard drives.  1
> > (root file system) is connected to the onboard Silicon Image 3114
> > controller.  The other 3 (/home) are in a software RAID 5 connected
> > to a PCI Silicon Image 3124 card.
> >
> > [remainder of the original post -- software, array/fs details, and
> > iostat output -- snipped; see the original message in this thread]
Re: RAID 5 performance issue.
Have you checked fragmentation?

xfs_db -c frag -f /dev/md3

What does this report?

Justin.

On Wed, 3 Oct 2007, Andrew Clayton wrote:

> Hi,
>
> Hardware:
>
> Dual Opteron 2GHz cpus.  2GB RAM.  4 x 250GB SATA hard drives.  1
> (root file system) is connected to the onboard Silicon Image 3114
> controller.  The other 3 (/home) are in a software RAID 5 connected
> to a PCI Silicon Image 3124 card.
>
> [remainder of the original post -- software, array/fs details, and
> iostat output -- snipped; see the original message in this thread]
RAID 5 performance issue.
Hi,

Hardware:

Dual Opteron 2GHz cpus.  2GB RAM.  4 x 250GB SATA hard drives.  1 (root
file system) is connected to the onboard Silicon Image 3114 controller.
The other 3 (/home) are in a software RAID 5 connected to a PCI Silicon
Image 3124 card.  I moved the 3 raid disks off the on board controller
onto the card the other day to see if that would help, it didn't.

Software:

Fedora Core 6, 2.6.23-rc9 kernel.

Array/fs details:

Filesystems are XFS

Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/sda2     xfs      20G  5.6G   14G  29% /
/dev/sda5     xfs     213G  3.6G  209G   2% /data
none          tmpfs  1008M     0 1008M   0% /dev/shm
/dev/md0      xfs     466G  237G  229G  51% /home

/dev/md0 is currently mounted with the following options:

noatime,logbufs=8,sunit=512,swidth=1024

sunit and swidth seem to be automatically set.  xfs_info shows

meta-data=/dev/md0         isize=256    agcount=16, agsize=7631168 blks
         =                 sectsz=4096  attr=1
data     =                 bsize=4096   blocks=122097920, imaxpct=25
         =                 sunit=64     swidth=128 blks, unwritten=1
naming   =version 2        bsize=4096
log      =internal         bsize=4096   blocks=32768, version=2
         =                 sectsz=4096  sunit=1 blks, lazy-count=0
realtime =none             extsz=524288 blocks=0, rtextents=0

The array has a 256k chunk size using left-symmetric layout.

/sys/block/md0/md/stripe_cache_size is currently at 4096 (upping this
from 256 alleviates the problem at best).

I also have currently set /sys/block/sd[bcd]/queue/nr_requests to 512
(doesn't seem to have made any difference).

Also blockdev --setra 8192 /dev/sd[bcd], also tried 16384 and 32768.

IO scheduler is cfq for all devices.

This machine acts as a file server for about 11 workstations.  /home
(the software RAID 5) is exported over NFS, whereby the clients mount
their home directories (using autofs).  I set it up about 3 years ago
and it has been fine.  However, earlier this year we started noticing
application stalls, e.g. firefox would become unresponsive and the
window would grey out (under Compiz); this typically lasts 2-4 seconds.

During these stalls, I see the below iostat activity (taken at 2 second
intervals on the file server).  High iowait, high awaits.  The
stripe_cache_active maxes out and things kind of grind to a halt for a
few seconds until the stripe_cache_active starts shrinking.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.25    0.00   99.75

Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00    5.47    0.00   40.80    14.91     0.05    9.73   7.18   3.93
sdb        0.00    0.00   1.49    1.49    5.97    9.95    10.67     0.06   18.50   9.00   2.69
sdc        0.00    0.00   0.00    2.99    0.00   15.92    10.67     0.01    4.17   4.17   1.24
sdd        0.00    0.00   0.50    2.49    1.99   13.93    10.67     0.02    5.67   5.67   1.69
md0        0.00    0.00   0.00    1.99    0.00    7.96     8.00     0.00    0.00   0.00   0.00
sde        0.00    0.00   0.00    0.00    0.00    0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    5.24    1.50    0.00   93.02

Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00   12.50    0.00   85.75    13.72     0.12    9.60   6.28   7.85
sdb      182.50  275.00 114.00   17.50  986.00   82.00    16.24   337.03  660.64   6.06  79.70
sdc      171.00  269.50 117.00   20.00 1012.00   94.00    16.15   315.35  677.73   5.86  80.25
sdd      149.00  278.00 107.00   18.50  940.00   84.00    16.32   311.83  705.33   6.33  79.40
md0        0.00    0.00   0.00 1012.00    0.00 8090.00    15.99     0.00    0.00   0.00   0.00
sde        0.00    0.00   0.00    0.00    0.00    0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    1.50   44.61    0.00   53.88

Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00    1.00    0.00    4.25     8.50     0.00    0.00   0.00   0.00
sdb      168.50   64.00 129.50   58.00 1114.00  508.00    17.30   645.37 1272.90   5.34 100.05
sdc      194.00   76.50 141.50   43.00 1232.00  360.00    17.26   664.01  916.30   5.42 100.05
sdd      172.00   90.50 114.50   50.00  996.00  456.00    17.65   662.54  977.28   6.08 100.05
md0        0.00    0.00   0.50    8.00    2.00   32.00     8.00     0.00    0.00   0.00   0.00
sde        0.00    0.00   0.00    0.00    0.00    0.00     0.00     0.00    0.00   0.00

[message truncated in the archive]
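The tuning knobs mentioned in the post, gathered into one sketch. Device names and values are the ones from the post; the block needs root to take effect, so it defaults to a dry run that only prints what it would do:

```shell
# Apply the md/block-layer tunables from the post (dry run unless APPLY=1).
APPLY=${APPLY:-0}
N=0
run() {
    N=$((N + 1))
    if [ "$APPLY" = 1 ]; then "$@"; else echo "would run: $*"; fi
}
run sh -c 'echo 4096 > /sys/block/md0/md/stripe_cache_size'
for d in sdb sdc sdd; do
    run sh -c "echo 512 > /sys/block/$d/queue/nr_requests"
    run blockdev --setra 8192 /dev/$d
done
```

Note the stripe cache is not free: 4096 entries * 4 KiB pages * 4 member disks pins about 64 MiB of RAM.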
Re: Journalling filesystem corruption fixed in between?
Rustedt, Florian wrote:
> Hello list,
>
> some folks reported severe filesystem-crashes with ext3 and reiserfs
> on mdraid level 1 and 5.

I guess much stronger evidence and details are needed.  Without any
additional information, I for one can only make a (not-so-pleasant)
guess about those "some folks", nothing more.

We're running several dozens of systems on raid1s and raid5s since the
2.4 kernel (and some since 2.2 if memory serves, with an additional
patch for raid functionality) -- nothing except the usual, mostly
hardware, problems since then.  And many other people use linux raid,
and especially the ext3 filesystem, in production on large boxes with
good load -- such a corruption, unless it is specific to a particular
system (due to, for example, bad ram or a faulty controller or
whatever), should cause a lot of messages here @linux-raid and
elsewhere.

/mjt