2.4.30: bug in file md.c, line 2473

2005-04-20 Thread dean gaudet
i got the following bug from 2.4.30 while trying to hot add a device tonight... i was trying to replace a disk in a 3-way raid1 -- the existing disks are sda, sdb, and i was replacing sdc. each of these disks has 3 partitions, each with a raid1. due to an improper shutdown the raids were

Re: SOLVED: forcing boot ordering of multilevel RAID arrays

2005-08-07 Thread dean gaudet
On Sun, 7 Aug 2005, Trevor Cordes wrote: Any array that is a superset of other arrays (a multilevel array) must be set to non-autodetect. Use fdisk to change the partition type to 83 (standard linux), NOT fd (linux raid autodetect). you know i'd be worried setting it to 0x83 will cause troubles

Re: RAID6 Query

2005-08-16 Thread dean gaudet
On Tue, 16 Aug 2005, Colonel Hell wrote: I just went thru a couple of papers describing RAID6. I dunno how relevant this discussion grp is for the qry ...but here I go :) ... I couldnt figure out why is P+Q configuration better over P+q' where q' == P. What I mean is instead of calculating

Re: split RAID1 during backups?

2005-10-24 Thread dean gaudet
On Mon, 24 Oct 2005, Jeff Breidenbach wrote: First of all, if the data is mostly static, rsync might work faster. Any operation that stats the individual files - even to just look at timestamps - takes about two weeks. Therefore it is hard for me to see rsync as a viable solution, even

Re: split RAID1 during backups?

2005-10-24 Thread dean gaudet
On Mon, 24 Oct 2005, Jeff Breidenbach wrote: Dean, the comment about write-mostly is confusing to me. Let's say I somehow marked one of the component drives write-mostly to quiet it down. How do I get at it? Linux will not let me mount the component partition if md0 is also mounted. Do you

Re: s/w raid and bios renumbering HDs

2005-10-31 Thread dean gaudet
On Mon, 31 Oct 2005, Hari Bhaskaran wrote: Hi, I am trying to setup a RAID-1 setup for the boot/root partition. I got the setup working, except what I see with some of my tests leave me less convinced that it is actually working. My system is debian 3.1 and I am not using the raid-setup

Re: s/w raid and bios renumbering HDs

2005-10-31 Thread dean gaudet
On Mon, 31 Oct 2005, Hari Bhaskaran wrote: So that DEVICE partitions line was really supposed to be there? Hehe... I thought it was just a help message and replaced it with DEVICE /dev/hda1 /dev/hdc1 :) you can use DEVICE /dev/hda1 /dev/hdc1 ... but then mdadm scans will only consider those

Re: Still Need Help on mdadm and udev

2005-11-10 Thread dean gaudet
On Thu, 10 Nov 2005, Bill Davidsen wrote: I haven't had a good use for a partitionable device i've used it to have root, swap, and some external xfs/ext3 logs on a single raid1... (the xfs/ext3 logs are for filesystems on another raid5) rather than managing 4 or 5 separate raid1s on the same

Re: Journal-guided Resynchronization for Software RAID

2005-12-02 Thread dean gaudet
On Thu, 1 Dec 2005, Neil Brown wrote: What I would really like is a cheap (Well, not too expensive) board that had at least 100Meg of NVRAM which was addressable on the PCI buss, and an XOR and RAID-6 engine connected to the DMA engine. there's the mythical giga-byte i-ram ... i say mythical

Re: Journal-guided Resynchronization for Software RAID

2005-12-08 Thread dean gaudet
On Mon, 5 Dec 2005, Neil Brown wrote: One of these with built in xor and raid6 would be nice, but I'm not sure I could guarantee a big enough market for them to try convincing them to make one... i wonder if the areca cards http://www.areca.com.tw/ are re-programmable... they seem to have

Re: Updating superblock to reflect new disc locations

2006-01-11 Thread dean gaudet
On Thu, 12 Jan 2006, Neil Brown wrote: On Wednesday January 11, [EMAIL PROTECTED] wrote: Any suggestions would be greatly appreciated. The system's new and not yet in production, so I can reinstall it if I have to, but I'd prefer to be able to fix something as simple as this.

Re: Configuring combination of RAID-1 RAID-5

2006-02-01 Thread dean gaudet
On Tue, 31 Jan 2006, Enrique Garcia Briones wrote: I have read the setting-up for the raid-5 and 1, but I would like to ask you if I can set-up a combined RAID configuration as mentioned above, since all the examples I found up to now just talk of one RAID configuration you can have more than

Re: Raid5 Debian Yaird Woes

2006-02-02 Thread dean gaudet
i've never looked at yaird in detail -- but you can probably use initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will use whichever one of those is installed. i know that initramfs-tools uses mdrun to start the root partition based on its UUID -- and so it should work

Re: Raid5 Debian Yaird Woes

2006-02-02 Thread dean gaudet
On Thu, 2 Feb 2006, dean gaudet wrote: i've never looked at yaird in detail -- but you can probably use initramfs-tools instead of yaird... i take it all back... i just tried initramfs-tools and it failed to boot my system properly... whereas yaird almost got everything right. the main

Re: Raid5 Debian Yaird Woes

2006-02-03 Thread dean gaudet
On Sat, 4 Feb 2006, Lewis Shobbrook wrote: Is there any way to avoid this requirement for input, so that the system skips the missing drive as the raid/initrd system did previously? what boot errors are you getting before it drops you to the root password prompt? is it trying to fsck

Re: Raid5 Debian Yaird Woes

2006-02-06 Thread dean gaudet
On Sun, 5 Feb 2006, Lewis Shobbrook wrote: On Saturday 04 February 2006 11:22 am, you wrote: On Sat, 4 Feb 2006, Lewis Shobbrook wrote: Is there any way to avoid this requirement for input, so that the system skips the missing drive as the raid/initrd system did previously? what boot

Re: NVRAM support

2006-02-10 Thread dean gaudet
On Fri, 10 Feb 2006, Bill Davidsen wrote: Erik Mouw wrote: You could use it for an external journal, or you could use it as a swap device. Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If you do a lot of operations which generate journal entries, file

Re: Auto-assembling arrays using mdadm

2006-03-09 Thread dean gaudet
On Thu, 9 Mar 2006, Sean Puttergill wrote: This is the kind of functionality provided by kernel RAID autodetect. You don't have to have any config information provided in advance. The kernel finds and assembles all arrays on disks with RAID autodetect partition type. I want to do the same

Re: how to clone a disk

2006-03-11 Thread dean gaudet
On Sat, 11 Mar 2006, Ming Zhang wrote: On Sat, 2006-03-11 at 06:53 -0500, Paul M. wrote: Since its raid5 you would be fine just pulling the disk out and letting the raid driver rebuild the array. If you have a hot spare yes, rebuilding is the simplest way. but rebuild will need to read

Re: how to clone a disk

2006-03-11 Thread dean gaudet
On Sat, 11 Mar 2006, Ming Zhang wrote: On Sat, 2006-03-11 at 16:15 -0800, dean gaudet wrote: you're planning to do this while the array is online? that's not safe... unless it's a read-only array... what i plan to do is to pull out the disk (which is ok now but going to die), so

Re: how to clone a disk

2006-03-11 Thread dean gaudet
On Sat, 11 Mar 2006, Ming Zhang wrote: On Sat, 2006-03-11 at 16:31 -0800, dean gaudet wrote: if you fail the disk from the array, or boot without the failing disk, then the event counter in the other superblocks will be updated... and the removed/failed disk will no longer be considered

Re: naming of md devices

2006-03-22 Thread dean gaudet
On Thu, 23 Mar 2006, Nix wrote: Last I heard the Debian initramfs constructs RAID arrays by explicitly specifying the devices that make them up. This is, um, a bad idea: the first time a disk fails or your kernel renumbers them you're in *trouble*. yaird seems to dtrt ... at least in

Re: raid5 that used parity for reads only when degraded

2006-03-24 Thread dean gaudet
On Thu, 23 Mar 2006, Alex Izvorski wrote: Also the cpu load is measured with Andrew Morton's cyclesoak tool which I believe to be quite accurate. there's something cyclesoak does which i'm not sure i agree with: cyclesoak process dirties an array of 100 bytes... so what you're really

Re: raid5 high cpu usage during reads - oprofile results

2006-04-01 Thread dean gaudet
On Sat, 1 Apr 2006, Alex Izvorski wrote: Dean - I think I see what you mean, you're looking at this line in the assembly? 65830 16.8830 : c1f: cmp%rcx,0x28(%rax) yup that's the one... that's probably a fair number of cache (or tlb) misses going on right there. I looked at

Re: md/mdadm fails to properly run on 2.6.15 after upgrading from 2.6.11

2006-04-09 Thread dean gaudet
On Sun, 9 Apr 2006, Marc L. de Bruin wrote: ... Okay, just pressing Control-D continues the boot process and AFAIK the root filesystem actually isn't corrupt. Running e2fsck returns no errors and booting 2.6.11 works just fine, but I have no clue why it picked the wrong partitions to build

Re: md/mdadm fails to properly run on 2.6.15 after upgrading from 2.6.11

2006-04-10 Thread dean gaudet
On Mon, 10 Apr 2006, Marc L. de Bruin wrote: dean gaudet wrote: initramfs-tools generates an mdrun /dev which starts all the raids it can find... but does not include the mdadm.conf in the initrd so i'm not sure it will necessarily start them in the right minor devices. try doing

Re: md/mdadm fails to properly run on 2.6.15 after upgrading from 2.6.11

2006-04-10 Thread dean gaudet
On Mon, 10 Apr 2006, Marc L. de Bruin wrote: However, all preferred minors are correct, meaning that the output is in sync with what I expected it to be from /etc/mdadm/mdadm.conf. Any other ideas? Just adding /etc/mdadm/mdadm.conf to the initrd does not seem to work, since mdrun seems to

Re: md/mdadm fails to properly run on 2.6.15 after upgrading from 2.6.11

2006-04-11 Thread dean gaudet
On Mon, 10 Apr 2006, Marc L. de Bruin wrote: dean gaudet wrote: On Mon, 10 Apr 2006, Marc L. de Bruin wrote: However, all preferred minors are correct, meaning that the output is in sync with what I expected it to be from /etc/mdadm/mdadm.conf. Any other ideas? Just adding

forcing a read on a known bad block

2006-04-11 Thread dean gaudet
hey Neil... i've been wanting to test out the reconstruct-on-read-error code... and i've had two chances to do so, but haven't been able to force md to read the appropriate block to trigger the code. i had two disks with SMART Current_Pending_Sector > 0 (which indicates pending read error) and i

proactive raid5 disk replacement success (using bitmap + raid1)

2006-04-23 Thread dean gaudet
i had a disk in a raid5 which i wanted to clone onto the hot spare... without going offline and without long periods without redundancy. a few folks have discussed using bitmaps and temporary (superblockless) raid1 mappings to do this... i'm not sure anyone has tried / reported success

Re: raid5 hang on get_active_stripe

2006-05-17 Thread dean gaudet
On Thu, 11 May 2006, dean gaudet wrote: On Tue, 14 Mar 2006, Neil Brown wrote: On Monday March 13, [EMAIL PROTECTED] wrote: I just experienced some kind of lockup accessing my 8-drive raid5 (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but now processes that try

Re: raid5 hang on get_active_stripe

2006-05-26 Thread dean gaudet
On Tue, 23 May 2006, Neil Brown wrote: I've spent all morning looking at this and while I cannot see what is happening I did find a couple of small bugs, so that is good... I've attached three patches. The first fix two small bugs (I think). The last adds some extra information to

Re: raid5 hang on get_active_stripe

2006-05-26 Thread dean gaudet
On Sat, 27 May 2006, Neil Brown wrote: On Friday May 26, [EMAIL PROTECTED] wrote: On Tue, 23 May 2006, Neil Brown wrote: i applied them against 2.6.16.18 and two days later i got my first hang... below is the stripe_cache foo. thanks -dean neemlark:~# cd /sys/block/md4/md/

Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)

2006-05-28 Thread dean gaudet
On Sun, 28 May 2006, Luca Berra wrote: - mdadm-2.5-rand.patch Posix dictates rand() versus bsd random() function, and dietlibc deprecated random(), so switch to srand()/rand() and make everybody happy. fwiw... lots of rand()s tend to suck... and RAND_MAX may not be large enough for this

Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)

2006-05-28 Thread dean gaudet
On Sun, 28 May 2006, Luca Berra wrote: dietlibc rand() and random() are the same function. but random will throw a warning saying it is deprecated. that's terribly obnoxious... it's never going to be deprecated, there are only approximately a bazillion programs using random(). -dean - To

Re: raid5 hang on get_active_stripe

2006-05-29 Thread dean gaudet
On Sun, 28 May 2006, Neil Brown wrote: The following patch adds some more tracing to raid5, and might fix a subtle bug in ll_rw_blk, though it is an incredible long shot that this could be affecting raid5 (if it is, I'll have to assume there is another bug somewhere). It certainly doesn't

Re: raid5 hang on get_active_stripe

2006-05-30 Thread dean gaudet
On Wed, 31 May 2006, Neil Brown wrote: On Tuesday May 30, [EMAIL PROTECTED] wrote: actually i think the rate is higher... i'm not sure why, but klogd doesn't seem to keep up with it: [EMAIL PROTECTED]:~# grep -c kblockd_schedule_work /var/log/messages 31 [EMAIL PROTECTED]:~#

Re: proactive raid5 disk replacement success (using bitmap + raid1)

2006-06-22 Thread dean gaudet
for sharing this. I am not quite understand about these 2 commands. Why we want to add a pre-failing disk back to md4? mdadm --zero-superblock /dev/sde1 mdadm /dev/md4 -a /dev/sde1 Ming On Sun, 2006-04-23 at 18:40 -0700, dean gaudet wrote: i had a disk in a raid5 which i wanted to clone

Re: Still can't get md arrays that were started from an initrd to shutdown

2006-07-17 Thread dean gaudet
On Mon, 17 Jul 2006, Christian Pernegger wrote: The problem seems to affect only arrays that are started via an initrd, even if they do not have the root filesystem on them. That's all arrays if they're either managed by EVMS or the ramdisk-creator is initramfs-tools. For yaird-generated

Re: Resize on dirty array?

2006-08-10 Thread dean gaudet
suggestions: - set up smartd to run long self tests once a month. (stagger it every few days so that your disks aren't doing self-tests at the same time) - run 2.6.15 or later so md supports repairing read errors from the other drives... - run 2.6.16 or later so you get the check and

Re: Resize on dirty array?

2006-08-13 Thread dean gaudet
On Fri, 11 Aug 2006, David Rees wrote: On 8/11/06, dean gaudet [EMAIL PROTECTED] wrote: On Fri, 11 Aug 2006, David Rees wrote: On 8/10/06, dean gaudet [EMAIL PROTECTED] wrote: - set up smartd to run long self tests once a month. (stagger it every few days so that your disks

Re: Is mdadm --create safe for existing arrays ?

2006-08-16 Thread dean gaudet
On Wed, 16 Aug 2006, Peter Greis wrote: So, how do I change / and /boot to make the super blocks persistent ? Is it safe to run mdadm --create /dev/md0 --raid-devices=2 --level=1 /dev/sda1 /dev/sdb1 without losing any data ? boot a rescue disk shrink the filesystems by a few MB to

Re: Feature Request/Suggestion - Drive Linking

2006-08-29 Thread dean gaudet
On Wed, 30 Aug 2006, Neil Bortnak wrote: Hi Everybody, I had this major recovery last week after a hardware failure monkeyed things up pretty badly. About half way though I had a couple of ideas and I thought I'd suggest/ask them. 1) Drive Linking: So let's say I have a 6 disk RAID5

Re: Resize on dirty array?

2006-08-30 Thread dean gaudet
On Sun, 13 Aug 2006, dean gaudet wrote: On Fri, 11 Aug 2006, David Rees wrote: On 8/11/06, dean gaudet [EMAIL PROTECTED] wrote: On Fri, 11 Aug 2006, David Rees wrote: On 8/10/06, dean gaudet [EMAIL PROTECTED] wrote: - set up smartd to run long self tests once a month

Re: RAID-5 recovery

2006-09-03 Thread dean gaudet
On Sun, 3 Sep 2006, Clive Messer wrote: This leads me to a question. I understand from reading the linux-raid archives that the current behaviour when rebuilding with a single badblock on another disk is for that disk to also be kicked from the array. that's not quite the current

Re: Feature Request/Suggestion - Drive Linking

2006-09-05 Thread dean gaudet
On Mon, 4 Sep 2006, Bill Davidsen wrote: But I think most of the logic exists, the hardest part would be deciding what to do. The existing code looks as if it could be hooked to do this far more easily than writing new. In fact, several suggested recovery schemes involve stopping the RAID5,

Re: proactive-raid-disk-replacement

2006-09-08 Thread dean gaudet
On Fri, 8 Sep 2006, Michael Tokarev wrote: Recently Dean Gaudet, in thread titled 'Feature Request/Suggestion - Drive Linking', mentioned his document, http://arctic.org/~dean/proactive-raid5-disk-replacement.txt I've read it, and have some umm.. concerns. Here's why: mdadm -Gb

Re: UUID's

2006-09-08 Thread dean gaudet
On Sat, 9 Sep 2006, Richard Scobie wrote: If I have specified an array in mdadm.conf using UUID's: ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371 and I replace a failed drive in the array, will the new drive be given the previous UUID, or do I need to update the mdadm.conf

Re: UUID's

2006-09-09 Thread dean gaudet
On Sat, 9 Sep 2006, Richard Scobie wrote: To remove all doubt about what is assembled where, I thought going to: DEVICE partitions MAILADDR root ARRAY /dev/md3 UUID=xyz etc. would be more secure. Is this correct thinking on my part? yup. mdadm can generate it all for you... there's

Re: Care and feeding of RAID?

2006-09-09 Thread dean gaudet
On Tue, 5 Sep 2006, Paul Waldo wrote: What about bitmaps? Nobody has mentioned them. It is my understanding that you just turn them on with mdadm /dev/mdX -b internal. Any caveats for this? bitmaps have been working great for me on a raid5 and raid1. it makes it that much more tolerable

Re: access *existing* array from knoppix

2006-09-12 Thread dean gaudet
On Tue, 12 Sep 2006, Dexter Filmore wrote: Am Dienstag, 12. September 2006 16:08 schrieb Justin Piszcz: /dev/MAKEDEV /dev/md0 also make sure the SW raid modules etc are loaded if necessary. Won't work, MAKEDEV doesn't know how to create [/dev/]md0. echo 'DEVICE partitions'

Re: USB and raid... Device names change

2006-09-18 Thread dean gaudet
On Tue, 19 Sep 2006, Eduardo Jacob wrote: DEVICE /dev/raid111 /dev/raid121 ARRAY /dev/md0 level=raid1 num-devices=2 UUID=1369e13f:eb4fa45c:6d4b9c2a:8196aa1b try using DEVICE partitions... then mdadm -As /dev/md0 will scan all available partitions for raid components with

Re: why partition arrays?

2006-10-24 Thread dean gaudet
On Tue, 24 Oct 2006, Bill Davidsen wrote: My read on LVM is that (a) it's one more thing for the admin to learn, (b) because it's seldom used the admin will be working from documentation if it has a problem, and (c) there is no bug-free software, therefore the use of LVM on top of RAID will

Re: md array numbering is messed up

2006-10-30 Thread dean gaudet
On Mon, 30 Oct 2006, Brad Campbell wrote: Michael Tokarev wrote: My guess is that it's using mdrun shell script - the same as on Debian. It's a long story, the thing is quite ugly and messy and does messy things too, but they says it's compatibility stuff and continue shipping it. ...

Re: RAID5/10 chunk size and ext2/3 stride parameter

2006-11-03 Thread dean gaudet
On Tue, 24 Oct 2006, martin f krafft wrote: Hi, I cannot find authoritative information about the relation between the RAID chunk size and the correct stride parameter to use when creating an ext2/3 filesystem. you know, it's interesting -- mkfs.xfs somehow gets the right sunit/swidth
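
The snippet above is cut off before it answers the stride question. As a minimal sketch, assuming the usual convention that the ext2/3 stride is the number of filesystem blocks that fit in one RAID chunk, the arithmetic looks like this. The function name ext_stride is hypothetical and used only for illustration.

```shell
#!/bin/sh
# ext_stride CHUNK_KIB BLOCK_KIB   (hypothetical helper, not part of any tool)
# Prints how many filesystem blocks fit in one RAID chunk.
# Assumption: stride = chunk size / filesystem block size.
ext_stride() {
    chunk_kib=$1
    block_kib=$2
    # Refuse sizes that do not divide evenly; the stride must be a whole number.
    if [ $((chunk_kib % block_kib)) -ne 0 ]; then
        echo "chunk size must be a multiple of the block size" >&2
        return 1
    fi
    echo $((chunk_kib / block_kib))
}
```

For a 64 KiB chunk and 4 KiB blocks this prints 16, which could then be passed to mke2fs as -E stride=16 (older e2fsprogs releases spelled the option -R stride=16).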

mdadm 2.5.5 external bitmap assemble problem

2006-11-04 Thread dean gaudet
i think i've got my mdadm.conf set properly for an external bitmap -- but it doesn't seem to work. i can assemble from the command-line fine though: # grep md4 /etc/mdadm/mdadm.conf ARRAY /dev/md4 bitmap=/bitmap.md4 UUID=dbc3be0b:b5853930:a02e038c:13ba8cdc # mdadm -A /dev/md4 mdadm: Could not

Re: RAID5/10 chunk size and ext2/3 stride parameter

2006-11-04 Thread dean gaudet
On Sat, 4 Nov 2006, martin f krafft wrote: also sprach dean gaudet [EMAIL PROTECTED] [2006.11.03.2019 +0100]: I cannot find authoritative information about the relation between the RAID chunk size and the correct stride parameter to use when creating an ext2/3 filesystem. you know

Re: Checking individual drive state

2006-11-05 Thread dean gaudet
On Sun, 5 Nov 2006, Bradshaw wrote: I've recently built a smallish RAID5 box as a storage area for my home network, using mdadm. However, one of the drives will not remain in the array for longer than around two days before it is removed. Readding it to the array does not throw any errors,

Re: RAID5 array showing as degraded after motherboard replacement

2006-11-05 Thread dean gaudet
On Sun, 5 Nov 2006, James Lee wrote: Hi there, I'm running a 5-drive software RAID5 array across two controllers. The motherboard in that PC recently died - I sent the board back for RMA. When I refitted the motherboard, connected up all the drives, and booted up I found that the array

Re: Is my RAID broken?

2006-11-06 Thread dean gaudet
On Mon, 6 Nov 2006, Mikael Abrahamsson wrote: On Mon, 6 Nov 2006, Neil Brown wrote: So it looks like your machine recently crashed (power failure?) and it is restarting. Or upgrade some part of the OS and now it'll do resync every week or so (I think this is debian default nowadays,

Re: mdadm 2.5.5 external bitmap assemble problem

2006-11-06 Thread dean gaudet
On Mon, 6 Nov 2006, Neil Brown wrote: hey i have another related question... external bitmaps seem to pose a bit of a chicken-and-egg problem. all of my filesystems are md devices. with an external bitmap i need at least one of the arrays to start, then have filesystems mounted, then

Re: RAID5 array showing as degraded after motherboard replacement

2006-11-06 Thread dean gaudet
On Mon, 6 Nov 2006, James Lee wrote: Thanks for the reply Dean. I looked through dmesg output from the boot up, to check whether this was just an ordering issue during the system start up (since both evms and mdadm attempt to activate the array, which could cause things to go wrong...).

Re: [PATCH 001 of 6] md: Send online/offline uevents when an md array starts/stops.

2006-11-06 Thread dean gaudet
On Mon, 6 Nov 2006, Neil Brown wrote: This creates a deep disconnect between udev and md. udev expects a device to appear first, then it created the device-special-file in /dev. md expect the device-special-file to exist first, and then created the device on the first open. could you create

Re: RAID5 array showing as degraded after motherboard replacement

2006-11-07 Thread dean gaudet
On Wed, 8 Nov 2006, James Lee wrote: However I'm still seeing the error messages in my dmesg (the ones I posted earlier), and they suggest that there is some kind of hardware fault (based on a quick Google of the error codes). So I'm a little confused. the fact that the error is in a

Re: raid5 hang on get_active_stripe

2006-11-15 Thread dean gaudet
and i haven't seen it either... neil do you think your latest patch was hiding the bug? 'cause there was an iteration of an earlier patch which didn't produce much spam in dmesg but the bug was still there, then there is the version below which spams dmesg a fair amount but i didn't see the

Re: safest way to swap in a new physical disk

2006-11-18 Thread dean gaudet
On Tue, 14 Nov 2006, Will Sheffler wrote: Hi. What is the safest way to switch out a disk in a software raid array created with mdadm? I'm not talking about replacing a failed disk, I want to take a healthy disk in the array and swap it for another physical disk. Specifically, I have an

Re: Raid 1 (non) performance

2006-11-19 Thread dean gaudet
On Wed, 15 Nov 2006, Magnus Naeslund(k) wrote: # cat /proc/mdstat Personalities : [raid1] md2 : active raid1 sda3[0] sdb3[1] 236725696 blocks [2/2] [UU] md1 : active raid1 sda2[0] sdb2[1] 4192896 blocks [2/2] [UU] md0 : active raid1 sda1[0] sdb1[1] 4192832 blocks

Re: Observations of a failing disk

2006-11-27 Thread dean gaudet
On Tue, 28 Nov 2006, Richard Scobie wrote: Anyway, my biggest concern is why echo repair > /sys/block/md5/md/sync_action appeared to have no effect at all, when I understand that it should re-write unreadable sectors? i've had the same thing happen on a seagate 7200.8 pata 400GB... and

Re: Shrinking a RAID1--superblock problems

2006-12-12 Thread dean gaudet
On Tue, 12 Dec 2006, Jonathan Terhorst wrote: I need to shrink a RAID1 array and am having trouble with the persistent superblock; namely, mdadm --grow doesn't seem to relocate it. If I downsize the array and then shrink the corresponding partitions, the array fails since the superblock

Re: raid5 software vs hardware: parity calculations?

2007-01-12 Thread dean gaudet
On Thu, 11 Jan 2007, James Ralston wrote: I'm having a discussion with a coworker concerning the cost of md's raid5 implementation versus hardware raid5 implementations. Specifically, he states: The performance [of raid5 in hardware] is so much better with the write-back caching on the

Re: raid5 software vs hardware: parity calculations?

2007-01-13 Thread dean gaudet
On Sat, 13 Jan 2007, Robin Bowes wrote: Bill Davidsen wrote: There have been several recent threads on the list regarding software RAID-5 performance. The reference might be updated to reflect the poor write performance of RAID-5 until/unless significant tuning is done. Read that as

Re: raid5 software vs hardware: parity calculations?

2007-01-15 Thread dean gaudet
On Mon, 15 Jan 2007, Robin Bowes wrote: I'm running RAID6 instead of RAID5+1 - I've had a couple of instances where a drive has failed in a RAID5+1 array and a second has failed during the rebuild after the hot-spare had kicked in. if the failures were read errors without losing the entire

Re: raid5 software vs hardware: parity calculations?

2007-01-15 Thread dean gaudet
On Mon, 15 Jan 2007, berk walker wrote: dean gaudet wrote: echo check > /sys/block/mdX/md/sync_action it'll read the entire array (parity included) and correct read errors as they're discovered. Could I get a pointer as to how I can do this check in my FC5 [BLAG] system? I can find

Re: raid5 software vs hardware: parity calculations?

2007-01-15 Thread dean gaudet
On Mon, 15 Jan 2007, Mr. James W. Laferriere wrote: Hello Dean , On Mon, 15 Jan 2007, dean gaudet wrote: ...snip... it should just be: echo check > /sys/block/mdX/md/sync_action if you don't have a /sys/block/mdX/md/sync_action file then your kernel is too old... or you
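
The advice in this thread (write "check" to the array's sync_action file, and treat a missing file as a sign that the kernel is too old) can be sketched as a small shell function. The function name md_start_check and its optional second argument are hypothetical; the second argument exists only so the function can be pointed at a scratch directory instead of the real /sys. The snippet is truncated before it gives the second possible cause of a missing file, so the error message below names only the first.

```shell
#!/bin/sh
# md_start_check MD_NAME [SYSFS_ROOT]   (hypothetical helper)
# Starts a "check" pass on the named md array by writing to its
# sync_action file. SYSFS_ROOT defaults to /sys.
md_start_check() {
    md=$1
    root=${2:-/sys}
    action="$root/block/$md/md/sync_action"
    # Per the thread: no sync_action file suggests the kernel is too old.
    if [ ! -e "$action" ]; then
        echo "$action not found (kernel may be too old)" >&2
        return 1
    fi
    echo check > "$action"
}
```

On a real system this would be run as root, for example md_start_check md0. Progress would then show up in /proc/mdstat.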

Re: bad performance on RAID 5

2007-01-18 Thread dean gaudet
On Wed, 17 Jan 2007, Sevrin Robstad wrote: I'm suffering from bad performance on my RAID5. an echo check > /sys/block/md0/md/sync_action gives a speed of only about 5000K/sec, and HIGH load average : # uptime 20:03:55 up 8 days, 19:55, 1 user, load average: 11.70, 4.04, 1.52 iostat

Re: md autodetect only detects one disk in raid1

2007-01-27 Thread dean gaudet
take a look at your mdadm.conf ... both on your root fs and in your initrd... look for a DEVICE line and make sure it says DEVICE partitions... anything else is likely to cause problems like below. also make sure each array is specified by UUID rather than device. and then rebuild your

Re: Reshaping raid0/10

2007-02-21 Thread dean gaudet
On Thu, 22 Feb 2007, Neil Brown wrote: On Wednesday February 21, [EMAIL PROTECTED] wrote: Hello, are there any plans to support reshaping on raid0 and raid10? No concrete plans. It largely depends on time and motivation. I expect that the various flavours of raid5/raid6

Re: Linux Software RAID Bitmap Question

2007-02-28 Thread dean gaudet
On Mon, 26 Feb 2007, Neil Brown wrote: On Sunday February 25, [EMAIL PROTECTED] wrote: I believe Neil stated that using bitmaps does incur a 10% performance penalty. If one's box never (or rarely) crashes, is a bitmap needed? I think I said it can incur such a penalty. The actual cost

Re: Replace drive in RAID5 without losing redundancy?

2007-03-05 Thread dean gaudet
On Tue, 6 Mar 2007, Neil Brown wrote: On Monday March 5, [EMAIL PROTECTED] wrote: Is it possible to mark a disk as to be replaced by an existing spare, then migrate to the spare disk and kick the old disk _after_ migration has been done? Or not even kick - but mark as new spare.

Re: mdadm: raid1 with ext3 - filesystem size differs?

2007-03-20 Thread dean gaudet
it looks like you created the filesystem on the component device before creating the raid. -dean On Fri, 16 Mar 2007, Hanno Meyer-Thurow wrote: Hi all! Please CC me on answers since I am not subscribed to this list, thanks. When I try to build a raid1 system with mdadm 2.6.1 the

Re: XFS sunit/swidth for raid10

2007-03-22 Thread dean gaudet
On Thu, 22 Mar 2007, Peter Rabbitson wrote: dean gaudet wrote: On Thu, 22 Mar 2007, Peter Rabbitson wrote: Hi, How does one determine the XFS sunit and swidth sizes for a software raid10 with 3 copies? mkfs.xfs uses the GET_ARRAY_INFO ioctl to get the data it needs from

Re: XFS sunit/swidth for raid10

2007-03-25 Thread dean gaudet
On Fri, 23 Mar 2007, Peter Rabbitson wrote: dean gaudet wrote: On Thu, 22 Mar 2007, Peter Rabbitson wrote: dean gaudet wrote: On Thu, 22 Mar 2007, Peter Rabbitson wrote: Hi, How does one determine the XFS sunit and swidth sizes for a software raid10 with 3

Re: Raid array is not automatically detected.

2007-07-17 Thread dean gaudet
On Mon, 16 Jul 2007, David Greaves wrote: Bryan Christ wrote: I do have the type set to 0xfd. Others have said that auto-assemble only works on RAID 0 and 1, but just as Justin mentioned, I too have another box with RAID5 that gets auto assembled by the kernel (also no initrd). I

Re: external bitmaps.. and more

2007-12-11 Thread dean gaudet
On Thu, 6 Dec 2007, Michael Tokarev wrote: I come across a situation where external MD bitmaps aren't usable on any standard linux distribution unless special (non-trivial) actions are taken. First is a small buglet in mdadm, or two. It's not possible to specify --bitmap= in assemble

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-27 Thread dean gaudet
hmm this seems more serious... i just ran into it with chunksize 64KiB and while just untarring a bunch of linux kernels in parallel... increasing stripe_cache_size did the trick again. -dean On Thu, 27 Dec 2007, dean gaudet wrote: hey neil -- remember that raid5 hang which me and only one

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-27 Thread dean gaudet
... in this case it's with a workload which is untarring 34 copies of the linux kernel at the same time. it's a variant of doug ledford's memtest, and i've attached it. -dean
#!/usr/bin/perl
# Copyright (c) 2007 dean gaudet [EMAIL PROTECTED]
#
# Permission is hereby granted, free of charge, to any

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-29 Thread dean gaudet
to (chunk_size * raid_disks * stripe_cache_size) or (chunk_size * raid_disks * stripe_cache_active)? -dean On Thu, 27 Dec 2007, dean gaudet wrote: hmm this seems more serious... i just ran into it with chunksize 64KiB and while just untarring a bunch of linux kernels in parallel... increasing

Re: Linux RAID Partition Offset 63 cylinders / 30% performance hit?

2007-12-29 Thread dean gaudet
On Tue, 25 Dec 2007, Bill Davidsen wrote: The issue I'm thinking about is hardware sector size, which on modern drives may be larger than 512b and therefore entail a read-alter-rewrite (RAR) cycle when writing a 512b block. i'm not sure any shipping SATA disks have larger than 512B sectors
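
The concern in this thread is whether a partition that starts at sector 63 lines up with a larger unit such as a bigger hardware sector or a RAID chunk. A minimal sketch of that arithmetic follows, assuming start offsets are counted in 512-byte sectors. The function name is_aligned is hypothetical, and the 4096-byte unit in the usage note is only an illustrative value (the reply itself doubts that any shipping SATA disk had sectors larger than 512 bytes at the time).

```shell
#!/bin/sh
# is_aligned START_SECTOR UNIT_BYTES   (hypothetical helper)
# Succeeds (exit status 0) when a partition starting at START_SECTOR,
# counted in 512-byte sectors, begins on a multiple of UNIT_BYTES.
is_aligned() {
    start_bytes=$(( $1 * 512 ))
    [ $(( start_bytes % $2 )) -eq 0 ]
}
```

For example, is_aligned 63 4096 fails because 63 * 512 = 32256 is not a multiple of 4096, while is_aligned 64 4096 succeeds.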

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-29 Thread dean gaudet
On Sat, 29 Dec 2007, Dan Williams wrote: On Dec 29, 2007 9:48 AM, dean gaudet [EMAIL PROTECTED] wrote: hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on the same 64k chunk array and had raised the stripe_cache_size to 1024... and got a hang. this time i grabbed

[patch] improve stripe_cache_size documentation

2007-12-29 Thread dean gaudet
Document the amount of memory used by the stripe cache and the fact that it's tied down and unavailable for other purposes (right?). thanks to Dan Williams for the formula. -dean Signed-off-by: dean gaudet [EMAIL PROTECTED] Index: linux/Documentation/md.txt
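
The formula itself is cut off in this snippet. As a sketch, assuming the form given in the kernel's md documentation (memory consumed = system page size * number of disks * stripe_cache_size), the calculation can be written as below. The function name stripe_cache_bytes is hypothetical.

```shell
#!/bin/sh
# stripe_cache_bytes PAGE_SIZE NR_DISKS STRIPE_CACHE_SIZE   (hypothetical helper)
# Prints the memory, in bytes, tied down by the raid5 stripe cache.
# Assumption: memory = page size * number of disks * stripe_cache_size.
stripe_cache_bytes() {
    page_size=$1
    nr_disks=$2
    stripe_cache_size=$3
    echo $(( page_size * nr_disks * stripe_cache_size ))
}
```

With 4096-byte pages, 8 disks and a stripe_cache_size of 256, this prints 8388608 (8 MiB). On a live system the page size could come from getconf PAGESIZE and the cache size from /sys/block/mdX/md/stripe_cache_size.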

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-29 Thread dean gaudet
On Sat, 29 Dec 2007, Justin Piszcz wrote: Curious btw what kind of filesystem size/raid type (5, but defaults I assume, nothing special right? (right-symmetric vs. left-symmetric, etc?)/cache size/chunk size(s) are you using/testing with? mdadm --create --level=5 --chunk=64 -n7 -x1 /dev/md2

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-29 Thread dean gaudet
On Sat, 29 Dec 2007, dean gaudet wrote: On Sat, 29 Dec 2007, Justin Piszcz wrote: Curious btw what kind of filesystem size/raid type (5, but defaults I assume, nothing special right? (right-symmetric vs. left-symmetric, etc?)/cache size/chunk size(s) are you using/testing

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-30 Thread dean gaudet
On Sat, 29 Dec 2007, Dan Williams wrote: On Dec 29, 2007 1:58 PM, dean gaudet [EMAIL PROTECTED] wrote: On Sat, 29 Dec 2007, Dan Williams wrote: On Dec 29, 2007 9:48 AM, dean gaudet [EMAIL PROTECTED] wrote: hmm bummer, i'm doing another test (rsync 3.5M inodes from another box

Re: [patch] improve stripe_cache_size documentation

2007-12-30 Thread dean gaudet
On Sun, 30 Dec 2007, Thiemo Nagel wrote: stripe_cache_size (currently raid5 only) As far as I have understood, it applies to raid6, too. good point... and raid4. here's an updated patch. -dean Signed-off-by: dean gaudet [EMAIL PROTECTED] Index: linux/Documentation/md.txt

Re: [patch] improve stripe_cache_size documentation

2007-12-30 Thread dean gaudet
On Sun, 30 Dec 2007, dean gaudet wrote: On Sun, 30 Dec 2007, Thiemo Nagel wrote: stripe_cache_size (currently raid5 only) As far as I have understood, it applies to raid6, too. good point... and raid4. here's an updated patch. and once again with a typo fix. oops. -dean

Re: Raid 1, can't get the second disk added back in.

2008-01-09 Thread dean gaudet
On Tue, 8 Jan 2008, Bill Davidsen wrote: Neil Brown wrote: On Monday January 7, [EMAIL PROTECTED] wrote: Problem is not raid, or at least not obviously raid related. The problem is that the whole disk, /dev/hdb is unavailable. Maybe check /sys/block/hdb/holders ? lsof

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread dean gaudet
On Thu, 10 Jan 2008, Neil Brown wrote: On Wednesday January 9, [EMAIL PROTECTED] wrote: On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote: i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread dean gaudet
On Fri, 11 Jan 2008, Neil Brown wrote: Thanks. But I suspect you didn't test it with a bitmap :-) I ran the mdadm test suite and it hit a problem - easy enough to fix. damn -- i lost my bitmap 'cause it was external and i didn't have things set up properly to pick it up after a reboot :) if

Re: [PATCH 001 of 6] md: Fix an occasional deadlock in raid5

2008-01-15 Thread dean gaudet
the performance numbers. Calling it in raid5d was sometimes too soon... Cc: Dan Williams [EMAIL PROTECTED] Signed-off-by: Neil Brown [EMAIL PROTECTED] probably doesn't matter, but for the record: Tested-by: dean gaudet [EMAIL PROTECTED] this time i tested with internal and external bitmaps

Re: [PATCH 001 of 6] md: Fix an occasional deadlock in raid5

2008-01-15 Thread dean gaudet
On Tue, 15 Jan 2008, Andrew Morton wrote: On Tue, 15 Jan 2008 21:01:17 -0800 (PST) dean gaudet [EMAIL PROTECTED] wrote: On Mon, 14 Jan 2008, NeilBrown wrote: raid5's 'make_request' function calls generic_make_request on underlying devices and if we run out of stripe heads
