Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
David Chinner wrote: On Wed, Jun 27, 2007 at 07:20:42PM -0400, Justin Piszcz wrote: For drives with 16MB of cache (in this case, raptors). That's four (4) drives, right? I'm pretty sure he's using 10 - email a few days back... Justin Piszcz wrote: Running test with 10 RAPTOR 150 hard drives, expect it to take awhile until I get the results, avg them etc. :) If so, how do you get a block read rate of 578MB/s from 4 drives? That's 145MB/s per drive Which gives a far more reasonable 60MB/s per drive... David - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
mdadm --create \
      --verbose /dev/md3 \
      --level=5 \
      --raid-devices=10 \
      --chunk=1024 \
      --force \
      --run /dev/sd[cdefghijkl]1

Justin.

On Thu, 28 Jun 2007, Peter Rabbitson wrote: Justin Piszcz wrote: The results speak for themselves: http://home.comcast.net/~jpiszcz/chunk/index.html What is the array layout (-l ? -n ? -p ?)
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
10 disks total. Justin. On Thu, 28 Jun 2007, David Chinner wrote: On Wed, Jun 27, 2007 at 07:20:42PM -0400, Justin Piszcz wrote: For drives with 16MB of cache (in this case, raptors). That's four (4) drives, right? If so, how do you get a block read rate of 578MB/s from 4 drives? That's 145MB/s per drive Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Re: mdadm usage: creating arrays with helpful names?
(back on list for google's benefit ;) and because there are some good questions and I don't know all the answers...) Oh, and Neil 'cos there may be a bug...

Richard Michael wrote: On Wed, Jun 27, 2007 at 08:49:22AM +0100, David Greaves wrote: http://linux-raid.osdl.org/index.php/Partitionable

Thanks. I didn't know this site existed (Googling even just 'mdadm' doesn't yield it in the first 100 results), and it's helpful.

Good... I got permission to wikify the 'official' linux raid FAQ but it takes time (and motivation!) to update it :) Hopefully it will snowball as people who use it then contribute back hint ;) As it becomes more valuable to people then more links will be created and Google will notice...

What if I don't want a partitioned array? I simply want the name to be nicer than the /dev/mdX or /dev/md/XX style. (p1 still gives me /dev/nicename and /dev/nicename0, as your page indicates.)

--auto md

mdadm --create /dev/strawberry --auto md ...

[EMAIL PROTECTED]:/tmp # mdadm --detail /dev/strawberry
/dev/strawberry:
        Version : 00.90.03
  Creation Time : Thu Jun 28 08:25:06 2007
     Raid Level : raid4

Also, when I use --create /dev/nicename --auto=p1 (for example), I also see /dev/md_d126 created. Why? There is then a /sys/block/md_d126 entry (presumably created by the md driver), but no /sys/block/nicename entry. Why?

Not sure who creates this, mdadm or udev. The code isn't that hard to read and you sound like you'd follow it if you fancied a skim-read... I too would expect that there should be a /sys/block/nicename - is this a bug, Neil? These options don't see a lot of use - I recently came across a bug in the --auto pX option...

Finally, --stop /dev/nicename doesn't remove any of the aforementioned /dev or /sys entries. I don't suppose that it should, but an mdadm command to do this would be helpful. So, how do I remove the oddly named /sys entries? (I removed the /dev entries with rm.)
man mdadm indicates --stop releases all resources, but it doesn't (and probably shouldn't).

'--stop' with mdadm does release the 'resources', ie the components you used. It doesn't remove the array. There is no delete - I guess since an rm is just as effective unless you use a nicename... [I think there should be a symmetry to the mdadm options --create/--delete and --start/--stop. It's *convenient* that --create also starts the array, but this conflates the issue a bit..]

I want to stop and completely remove all trace of the array. (Especially as I'm experimenting with this over loopback, and stuff hanging around irritates the lo driver.)

You're possibly mixing two things up here... Releasing the resources with a --stop would let you re-use a lo device in another array. You don't _need_ --delete (or rm). However md does write superblocks to the components and *mdadm* warns you that the loopback has a valid superblock:

mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid4 devices=6 ctime=Thu Jun 21 09:46:27 2007

[hmm, I can see why you may think it's part of an 'active' array]

You could do mdadm --zero-superblock to clean the component or just say yes when mdadm asks you to continue. See:

# mdadm --create /dev/strawberry --auto md --level=4 -n 6 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 /dev/loop6
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid4 devices=6 ctime=Thu Jun 28 08:25:06 2007
blah
Continue creating array? yes
mdadm: array /dev/strawberry started.
# mdadm --stop /dev/strawberry
mdadm: stopped /dev/strawberry
# mdadm --create /dev/strawberry --auto md --level=4 -n 6 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 /dev/loop6
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid4 devices=6 ctime=Thu Jun 28 09:07:29 2007
blah
Continue creating array? yes
mdadm: array /dev/strawberry started.
David
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
Justin Piszcz wrote: mdadm --create \ --verbose /dev/md3 \ --level=5 \ --raid-devices=10 \ --chunk=1024 \ --force \ --run /dev/sd[cdefghijkl]1 Justin. Interesting, I came up with the same results (1M chunk being superior) with a completely different raid set with XFS on top: mdadm --create \ --level=10 \ --chunk=1024 \ --raid-devices=4 \ --layout=f3 \ ... Could it be attributed to XFS itself? Peter
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, 28 Jun 2007, Peter Rabbitson wrote: Justin Piszcz wrote: mdadm --create \ --verbose /dev/md3 \ --level=5 \ --raid-devices=10 \ --chunk=1024 \ --force \ --run /dev/sd[cdefghijkl]1 Justin. Interesting, I came up with the same results (1M chunk being superior) with a completely different raid set with XFS on top: mdadm --create \ --level=10 \ --chunk=1024 \ --raid-devices=4 \ --layout=f3 \ ... Could it be attributed to XFS itself? Peter Good question, by the way how much cache do the drives have that you are testing with? Justin.
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
Justin Piszcz wrote: On Thu, 28 Jun 2007, Peter Rabbitson wrote: Interesting, I came up with the same results (1M chunk being superior) with a completely different raid set with XFS on top: ... Could it be attributed to XFS itself? Peter Good question, by the way how much cache do the drives have that you are testing with?

I believe 8MB, but I am not sure I am looking at the right number:

[EMAIL PROTECTED]:~# hdparm -i /dev/sda

/dev/sda:

 Model=aMtxro7 2Y050M, FwRev=AY5RH10W, SerialNo=6YB6Z7E4
 Config={ Fixed }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=DualPortCache, BuffSize=7936kB, MaxMultSect=16, MultSect=?0?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 0: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode

[EMAIL PROTECTED]:~#

1M chunk consistently delivered best performance with:
 o A plain dumb dd run
 o bonnie
 o two bonnie threads
 o iozone with 4 threads

My RA is set at 256 for the drives and 16384 for the array (128k and 8M respectively)
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, 28 Jun 2007, Peter Rabbitson wrote: Good question, by the way how much cache do the drives have that you are testing with? I believe 8MB, but I am not sure I am looking at the right number: [hdparm -i output snipped] BuffType=DualPortCache, BuffSize=7936kB, MaxMultSect=16, MultSect=?0? [...] 1M chunk consistently delivered best performance with: o A plain dumb dd run o bonnie o two bonnie threads o iozone with 4 threads My RA is set at 256 for the drives and 16384 for the array (128k and 8M respectively)

8MB yup: BuffSize=7936kB. My read ahead is set to 64 megabytes and 16384 for the stripe_cache_size.

Justin.
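For reference, the readahead and stripe-cache settings discussed above can be applied like so. This is a sketch only: md3 and the member disk names are examples, and the sector counts match Peter's figures (256 sectors = 128k per disk, 16384 sectors = 8M for the array).

```shell
# Per-disk readahead: 256 sectors (512 bytes each) = 128k
for d in /dev/sd[cdefghijkl]; do
    blockdev --setra 256 "$d"
done

# Array readahead: 16384 sectors = 8M
blockdev --setra 16384 /dev/md3

# stripe_cache_size is counted in 4k pages per member disk
echo 16384 > /sys/block/md3/md/stripe_cache_size
```

These settings do not persist across reboots; they would typically go in a boot script.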
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, 28 Jun 2007, Peter Rabbitson wrote: [earlier discussion and hdparm -i output snipped] 1M chunk consistently delivered best performance with: o A plain dumb dd run o bonnie o two bonnie threads o iozone with 4 threads My RA is set at 256 for the drives and 16384 for the array (128k and 8M respectively)

Have you also tried tuning:

1. nr_requests per each disk? I noticed 10-20 seconds faster speed (overall) with bonnie tests when I set all disks in the array to 512.

echo 512 > /sys/block/$i/queue/nr_requests

2. Also disable NCQ.
echo 1 > /sys/block/$i/device/queue_depth
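The two tweaks above can be applied to every member disk with a small loop. The disk list here is an example; on SATA disks, setting queue_depth to 1 effectively disables NCQ.

```shell
# Example member disks; substitute the disks in your array.
for i in sdc sdd sde sdf sdg sdh sdi sdj sdk sdl; do
    # deeper block-layer request queue per disk
    echo 512 > /sys/block/$i/queue/nr_requests
    # queue depth of 1 disables NCQ on SATA drives
    echo 1 > /sys/block/$i/device/queue_depth
done
```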
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, 28 Jun 2007, Justin Piszcz wrote: [earlier discussion and hdparm -i output snipped] Have you also tried tuning: 1. nr_requests per each disk? I noticed 10-20 seconds faster speed (overall) with bonnie tests when I set all disks in the array to 512. echo 512 > /sys/block/$i/queue/nr_requests 2. Also disable NCQ.
echo 1 > /sys/block/$i/device/queue_depth

Also per XFS: noatime,logbufs=8. I am testing various options; so far the logbufs=8 option is detrimental, making the entire bonnie++ run a little slower. I believe the default is 2 and it uses 32k(?) buffers (shown below) if the blocksize is less than 16K. I am trying with: noatime,logbufs=8,logbsize=262144 currently.

logbufs=value
    Set the number of in-memory log buffers. Valid numbers range from 2-8 inclusive. The default value is 8 buffers for filesystems with a blocksize of 64K, 4 buffers for filesystems with a blocksize of 32K, 3 buffers for filesystems with a blocksize of 16K, and 2 buffers for all other configurations. Increasing the number of buffers may increase performance on some workloads at the cost of the memory used for the additional log buffers and their associated control structures.
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, Jun 28, 2007 at 10:24:54AM +0200, Peter Rabbitson wrote: Interesting, I came up with the same results (1M chunk being superior) with a completely different raid set with XFS on top: mdadm --create \ --level=10 \ --chunk=1024 \ --raid-devices=4 \ --layout=f3 \ ... Could it be attributed to XFS itself?

Sort of..

/dev/md4:
        Version : 00.90.03
     Raid Level : raid5
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 4
 Active Devices : 4
Working Devices : 4
         Layout : left-symmetric
     Chunk Size : 256K

This means there are 3x 256k for the user data. Now I had to carefully tune the XFS bsize/sunit/swidth to match that:

meta-data=/dev/DataDisk/lvol0    isize=256    agcount=32, agsize=7325824 blks
         =                       sectsz=512   attr=1
data     =                       bsize=4096   blocks=234426368, imaxpct=25
         =                       sunit=64     swidth=192 blks, unwritten=1
...

That is, 4k * 64 = 256k, and 64 * 3 = 192.

With that, bulk writing on the file system runs without needing to read back blocks of disk-space to calculate RAID5 parity, because the filesystem's idea of a block now aligns with the RAID5 stripe. I do have LVM in between the MD-RAID5 and XFS, so I did also align the LVM to that 3 * 256k. Doing this alignment thing boosted write performance by nearly a factor of 2 over mkfs.xfs with default parameters.

With a very wide RAID5, like the original question... I would find it very surprising if the alignment of upper layers to the MD-RAID level were not important there as well.

Very small continuous writes do not make good use of the disk mechanics (seek time, rotational delay), so something in the order of 128k-1024k will speed things up -- presuming that when you are writing, you are doing it many MB at a time. Database transactions are a lot smaller, and are indeed harmed by such large megachunk-IO oriented surfaces. RAID levels 0 and 1 (and 10) do not need to read back parts of the surface when a subset of a stripe is left unaltered by an incoming write.
Some DB application on top of the filesystem would benefit if we had a way for it to ask about these alignment boundary issues, so it could read a whole alignment block even though it writes out only a subset of it. (Theory being that those same blocks would also exist in memory cache and thus be available for write-back parity calculation.) /Matti Aarnio
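The arithmetic behind Matti's sunit/swidth numbers can be sketched as follows, using the values from his 4-disk RAID5 above. The commented mkfs.xfs line shows how the same geometry can be passed directly with the su/sw options rather than computed by hand.

```shell
# Stripe geometry for a 4-disk RAID5 with a 256k chunk (assumed values)
chunk_kb=256       # md chunk size
data_disks=3       # 4 disks minus 1 parity disk per stripe
fs_block_kb=4      # XFS data block size (bsize=4096)

sunit=$((chunk_kb / fs_block_kb))   # stripe unit, in filesystem blocks
swidth=$((sunit * data_disks))      # stripe width, in filesystem blocks
echo "sunit=$sunit swidth=$swidth"

# mkfs.xfs can be told the geometry directly (su in bytes/k, sw in units):
#   mkfs.xfs -d su=${chunk_kb}k,sw=${data_disks} /dev/md4
```

With these values a full stripe write covers exactly 3 * 256k = 768k of user data, which is what lets XFS avoid the RAID5 read-modify-write path for bulk writes.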
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, 28 Jun 2007, Matti Aarnio wrote: On Thu, Jun 28, 2007 at 10:24:54AM +0200, Peter Rabbitson wrote: Interesting, I came up with the same results (1M chunk being superior) with a completely different raid set with XFS on top: mdadm --create \ --level=10 \ --chunk=1024 \ --raid-devices=4 \ --layout=f3 \ ... Could it be attributed to XFS itself? If anyone is interested, I also did a 2048k run; 1024k definitely results in the most optimal configuration. p34-128k-chunk,15696M,77236.3,99,445653,86.,192267,34.,78773.7,99,524463,41,594.9,0,16:10:16/64,1298.67,10.6667,5964.33,17.,3035.67,18.,1512,13.6667,5334.33,16,2634.67,19 p34-512k-chunk,15696M,78383,99,436842,86,162969,27,79624,99,486892,38,583.0,0,16:10:16/64,2019,17,9715,29,4272,23,2250,22,17095,45,3691,30 p34-1024k-chunk,15696M,77672.3,99,455267,87.,183772,29.6667,79601.3,99,578225,43.,595.933,0,16:10:16/64,2085.67,18,12953,39,3908.33,23.,2375.33,23.,18492,51.6667,3388.33,27 p34-2048k-chunk,15696M,76822,98,435439,86,164140,26.,77065.3,99,582948,44,631.467,0,16:10:16/64,1795.33,15,17612.3,49.,3668.67,20.6667,2040.67,19,13384,38,3255.33,25 p34-4096k-chunk,15696M,33791.1,43.5556,176630,37.,72235.1,11.5556,34424.9,44,247925,18.,271.644,0,16:10:16/64,560,4.9,2928,8.9,1039.56,5.8,571.556,5.3,1729.78,5.3,1289.33,9.3 http://home.comcast.net/~jpiszcz/chunk/ Justin.
Re: raid=noautodetect is apparently ignored?
On Wed, 2007-06-27 at 08:48 -0700, Andrew Burgess wrote: Odd Maybe you have an initrd which is loading md as a module, then running raidautorun or similar? .. I suspect that the last comment is the clue, after pivotroot I bet it runs another init, not from the boot/initrd images, but from the init.d in the root filesystem.

You are absolutely correct. On Fedora Core 5, in rc.sysinit:

echo "raidautorun /dev/md0" | nash --quiet
if [ -f /etc/mdadm.conf ]; then
    /sbin/mdadm -A -s
fi

But my original observation was correct. The noautodetect was/is being ignored whenever there is an initrd. FC5 doesn't support raid root partitions (mkinitrd doesn't put the right stuff in initrd), but FC7 tries to. I have upgraded and things are mostly correct. Albeit, FC7 doesn't support my nested raid configuration and so it took some coaxing to get the upgrade done and a hack to coax mkinitrd into doing the right thing. Putting mdadm.conf on a floppy disk plus a little intervention with a virtual console early in the upgrade process worked wonders.

One quick way to test this is to boot with init=/bin/sh This lets all the initrd stuff run but nothing from the root filesystem.

Neat idea. I'll try and remember that for the future. -- Ian Dall [EMAIL PROTECTED]
Re: mdadm usage: creating arrays with helpful names?
On Thu, Jun 28, 2007 at 09:12:56AM +0100, David Greaves wrote: (back on list for google's benefit ;) and because there are some good questions and I don't know all the answers... ) Thanks, I didn't realize I didn't 'reply-all' to stay on the list. Hopefully it will snowball as people who use it then contribute back hint ;) I will, I'm also keeping notes and changes to the man page. :) --auto md Ah. Thanks for the example(s). Also, when I use --create /dev/nicename --auto=p1 (for example), I also see /dev/md_d126 created. Why? There is then a /sys/block/md_d126 entry (presumably created by the md driver), but no /sys/block/nicename entry. Why? Not sure who creates this, mdadm or udev I'm guessing the kernel's md driver creates it; neither mdadm nor udev (just as the kernel creates, for example, sd* disk entries in /sys, but udev creates the nice entries in /dev). The code isn't that hard to read and you sound like you'd follow it if you fancied a skim-read... I read it for the --create option to see who created /dev/mdXX. :) I'll take another look. Thanks David. Cheers.
Does --write-behind= have to be done at create time?
I was wanting to try out the --write-behind option. I have a raid1 with bitmaps and write-mostly enabled, which are all the pre-requisites, I think. It would be nice if you could tweak this parameter on a live array, but failing that, it is hard to see why it couldn't be done at assemble time. mdadm won't let me though. Is this a fundamental limitation? A related question: if I do recreate the same array, with exactly the same parameters (except for the write-behind value), will my data still be OK? -- Ian Dall [EMAIL PROTECTED]
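For what it's worth, a re-create along the lines the poster is asking about might look like the sketch below. All device names and the write-behind value of 256 are examples, not from the thread; whether the data survives depends on using exactly the same level, device count, device order, and metadata layout as the original array, as the poster notes, and --assume-clean skips the initial resync of members that are already in sync.

```shell
# Stop the existing raid1 (example device: md0)
mdadm --stop /dev/md0

# Re-create with identical layout but a new --write-behind value.
# --write-behind requires a write-intent bitmap and a write-mostly member.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      --bitmap=internal --write-behind=256 --assume-clean \
      /dev/sda1 --write-mostly /dev/sdb1
```

Backing up first would obviously be prudent; getting any of the parameters wrong on re-create can destroy the data.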
Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume
Hi! FWIW, I'm on record stating that sync is not sufficient to quiesce an XFS filesystem for a suspend/resume to work safely and have argued that the only Hmm, so XFS writes to disk even when its threads are frozen? safe thing to do is freeze the filesystem before suspend and thaw it after resume. This is why I originally asked you to test that with the other problem Could you add that to the XFS threads if it is really required? They do know that they are being frozen for suspend. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume
On Wednesday, 27 June 2007 22:49, Pavel Machek wrote: Hi! FWIW, I'm on record stating that sync is not sufficient to quiesce an XFS filesystem for a suspend/resume to work safely and have argued that the only Hmm, so XFS writes to disk even when its threads are frozen? safe thing to do is freeze the filesystem before suspend and thaw it after resume. This is why I originally asked you to test that with the other problem Could you add that to the XFS threads if it is really required? They do know that they are being frozen for suspend. Well, do you remember the workqueues? They are still nonfreezable. Greetings, Rafael -- Premature optimization is the root of all evil. - Donald Knuth
Re: spare not becoming active
Number   Major   Minor   RaidDevice   State
   0       0       0         0        removed
   1       8      34         1        active sync   /dev/sdc2
   2       0       0         2        removed
   3       8      82         -        spare   /dev/sdf2
   4       8      66         -        spare   /dev/sde2
   5       8      50         -        faulty spare
   6       8      18         -        faulty spare

I was trying a couple of things, but never got to change this status. At one point I stopped the array and restarted it, and it didn't work. (I did it before, so I don't see why...)

# /sbin/mdadm -R /dev/md0
mdadm: failed to run array /dev/md0: Invalid argument

I'm starting to think the documentation I read was very outdated, or only touched the subject in surface. Can you guys recommend a good reading, like a companion to the man page?

Thanks, Simon
XFS mount option performance on Linux Software RAID 5
Still reviewing, but it appears logbufs=8 + logbsize=256k looks good. p34-noatime-logbufs=2-lbsize=256k,15696M,78172.3,99,450320,86.6667,178683,29,79808,99,565741,42.,610.067,0,16:10:16/64,2362,19.6667,15751.7,46,3993.33,22,2545.67,24.,13976,41,3781.33,28.6667 p34-noatime-logbufs=8-lbsize=256k,15696M,78238,99,455532,86.6667,182382,30,79741.7,99,571631,43,597.633,0,16:10:16/64,3421,29,12130,38.,5943.33,33,3671.33,35.6667,13521.3,41.,5162.33,38. p34-noatime-logbufs=8-lbsize=default,15696M,77872,98.6667,438661,86.6667,179848,29.,79368,99,555999,42,632.733,0.33,16:10:16/64,2090,17.6667,11183,33,3922.67,23,2271.33,22.,11709,35,3391.33,26. p34-noatime-only,15696M,77473,99,449689,86.6667,176960,29.,80186.3,99,568503,42.6667,592.633,0,16:10:16/64,2102,18,15935.3,44.6667,3825.67,22.,2353,23.6667,9727.33,29.,3265,25.6667 http://home.comcast.net/~jpiszcz/chunk/logbufs.html
Re: [linux-pm] Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume
On Thu 2007-06-28 17:27:34, Rafael J. Wysocki wrote: On Wednesday, 27 June 2007 22:49, Pavel Machek wrote: Hi! FWIW, I'm on record stating that sync is not sufficient to quiesce an XFS filesystem for a suspend/resume to work safely and have argued that the only Hmm, so XFS writes to disk even when its threads are frozen? safe thing to do is freeze the filesystem before suspend and thaw it after resume. This is why I originally asked you to test that with the other problem Could you add that to the XFS threads if it is really required? They do know that they are being frozen for suspend. Well, do you remember the workqueues? They are still nonfreezable. Oops, that would explain it :-(. Can we make XFS stop using them? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
On Thu, Jun 28, 2007 at 04:27:15AM -0400, Justin Piszcz wrote: On Thu, 28 Jun 2007, Peter Rabbitson wrote: Justin Piszcz wrote: mdadm --create \ --verbose /dev/md3 \ --level=5 \ --raid-devices=10 \ --chunk=1024 \ --force \ --run /dev/sd[cdefghijkl]1 Justin. Interesting, I came up with the same results (1M chunk being superior) with a completely different raid set with XFS on top: mdadm --create \ --level=10 \ --chunk=1024 \ --raid-devices=4 \ --layout=f3 \ ... Could it be attributed to XFS itself?

More likely it's related to the I/O size being sent to the disks. The larger the chunk size, the larger the I/O hitting each disk. I think the maximum I/O size is 512k ATM on x86(_64), so a chunk of 1MB will guarantee that there are maximally sized I/Os being sent to the disk.

Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
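The per-device maximum I/O size Dave refers to can be inspected through sysfs; sdc here is an example device name.

```shell
# Largest I/O the block layer will currently issue to this disk, in kB
cat /sys/block/sdc/queue/max_sectors_kb

# Hardware/driver ceiling for the same value
cat /sys/block/sdc/queue/max_hw_sectors_kb
```

If max_sectors_kb reports 512, a 1MB chunk indeed guarantees that each disk receives maximally sized requests during full-stripe operations.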
Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume
On Wed, Jun 27, 2007 at 08:49:24PM +, Pavel Machek wrote: Hi! FWIW, I'm on record stating that sync is not sufficient to quiesce an XFS filesystem for a suspend/resume to work safely and have argued that the only Hmm, so XFS writes to disk even when its threads are frozen?

They issue async I/O before they sleep and expect processing to be done on I/O completion via workqueues.

safe thing to do is freeze the filesystem before suspend and thaw it after resume. This is why I originally asked you to test that with the other problem Could you add that to the XFS threads if it is really required? They do know that they are being frozen for suspend.

We don't suspend the threads on a filesystem freeze - they continue to run. A filesystem freeze guarantees the filesystem is clean and that the in-memory state matches what is on disk. It is not possible for the filesystem to issue I/O or have outstanding I/O when it is in the frozen state, so the state of the threads and/or workqueues does not matter because they will be idle.

Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
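From userspace, the freeze/thaw cycle Dave describes can be driven around a suspend like this. The mount point is an example; xfs_freeze is the XFS-specific tool (freezing is also what the hibernation code path would do internally via freeze_bdev()).

```shell
# Freeze: flush dirty data and quiesce the filesystem; all new writes block
xfs_freeze -f /mnt/data

# ... suspend / hibernate here ...

# Thaw on resume; blocked writers proceed
xfs_freeze -u /mnt/data
```

While frozen, the filesystem has no outstanding I/O, which is why the freezability of the XFS threads and workqueues stops mattering.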
Re: [linux-pm] Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume
On Fri, Jun 29, 2007 at 12:16:44AM +0200, Rafael J. Wysocki wrote: There are two solutions possible, IMO. One would be to make these workqueues freezable, which is possible, but hacky and Oleg didn't like that very much. The second would be to freeze XFS from within the hibernation code path, using freeze_bdev(). The second is much more likely to work reliably. If freezing the filesystem leaves something in an inconsistent state, then it's something I can reproduce and debug without needing to suspend/resume. FWIW, don't forget you need to thaw the filesystem on resume. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group