Re: Superblocks
Greg Cormier wrote: Any reason 0.9 is the default? Should I be worried about using 1.0 superblocks? And can I upgrade my array from 0.9 to 1.0 superblocks? Do understand that Neil may have other reasons... but mainly the 0.9 format is the default because it is most widely supported and allows you to use new mdadm versions on old distributions (I still have one FC1 machine!). As for changing metadata on an existing array, I really can't offer any help. Thanks, Greg On 11/1/07, Neil Brown [EMAIL PROTECTED] wrote: On Tuesday October 30, [EMAIL PROTECTED] wrote: Which is the default type of superblock? 0.90 or 1.0? The default default is 0.90. However a local default can be set in mdadm.conf with e.g. CREATE metadata=1.0 NeilBrown -- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979
Re: Superblocks
Neil Brown wrote: On Tuesday October 30, [EMAIL PROTECTED] wrote: Which is the default type of superblock? 0.90 or 1.0? The default default is 0.90. However a local default can be set in mdadm.conf with e.g. CREATE metadata=1.0 If you change to 1.start, 1.end, 1.4k names for clarity, they need to be accepted here, as well. -- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979
Re: Superblocks
Any reason 0.9 is the default? Should I be worried about using 1.0 superblocks? And can I upgrade my array from 0.9 to 1.0 superblocks? Thanks, Greg On 11/1/07, Neil Brown [EMAIL PROTECTED] wrote: On Tuesday October 30, [EMAIL PROTECTED] wrote: Which is the default type of superblock? 0.90 or 1.0? The default default is 0.90. However a local default can be set in mdadm.conf with e.g. CREATE metadata=1.0 NeilBrown
Re: Superblocks
On Tuesday October 30, [EMAIL PROTECTED] wrote: Which is the default type of superblock? 0.90 or 1.0? The default default is 0.90. However a local default can be set in mdadm.conf with e.g. CREATE metadata=1.0 NeilBrown
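For illustration, a minimal mdadm.conf sketch making v1.0 the per-machine default (the commented create line uses hypothetical devices):

    # /etc/mdadm.conf -- default metadata format for newly created arrays
    CREATE metadata=1.0

    # a plain create then picks it up without an explicit -e flag:
    # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1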
Re: Superblocks
Which is the default type of superblock? 0.90 or 1.0? On 10/30/07, Neil Brown [EMAIL PROTECTED] wrote: On Friday October 26, [EMAIL PROTECTED] wrote: Can someone help me understand superblocks and MD a little bit? I've got a raid5 array with 3 disks - sdb1, sdc1, sdd1. --examine on these 3 drives shows correct information. However, if I also examine the raw disk devices, sdb and sdd, they also appear to have superblocks with some semi-valid-looking information. sdc has no superblock. If a partition starts a multiple of 64K from the start of the device, and ends within about 64K of the end of the device, then a superblock on the partition will also look like a superblock on the whole device. This is one of the shortcomings of v0.90 superblocks. v1.0 doesn't have this problem. How can I clear these? If I unmount my raid, stop md0, it won't clear it. mdadm --zero-superblock <device-name> is the best way to remove an unwanted superblock. Of course, in the case described above, removing the unwanted superblock will remove the wanted one as well. [EMAIL PROTECTED] ~]# mdadm --zero-superblock /dev/hdd mdadm: Couldn't open /dev/hdd for write - not zeroing As I think someone else pointed out, /dev/hdd is not /dev/sdd. NeilBrown
Re: Superblocks
On Friday October 26, [EMAIL PROTECTED] wrote: Can someone help me understand superblocks and MD a little bit? I've got a raid5 array with 3 disks - sdb1, sdc1, sdd1. --examine on these 3 drives shows correct information. However, if I also examine the raw disk devices, sdb and sdd, they also appear to have superblocks with some semi-valid-looking information. sdc has no superblock. If a partition starts a multiple of 64K from the start of the device, and ends within about 64K of the end of the device, then a superblock on the partition will also look like a superblock on the whole device. This is one of the shortcomings of v0.90 superblocks. v1.0 doesn't have this problem. How can I clear these? If I unmount my raid, stop md0, it won't clear it. mdadm --zero-superblock <device-name> is the best way to remove an unwanted superblock. Of course, in the case described above, removing the unwanted superblock will remove the wanted one as well. [EMAIL PROTECTED] ~]# mdadm --zero-superblock /dev/hdd mdadm: Couldn't open /dev/hdd for write - not zeroing As I think someone else pointed out, /dev/hdd is not /dev/sdd. NeilBrown
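To see why the two locations can coincide, here is a rough sketch of the v0.90 placement rule, following MD's offset macro (sizes in 512-byte sectors; the device name is hypothetical):

    # v0.90 puts its superblock in the last 64 KiB-aligned 64 KiB of a device:
    #   sb_sector = (device_sectors & ~127) - 128
    dev=/dev/sdb1
    secs=$(blockdev --getsz "$dev")
    echo $(( (secs & ~127) - 128 ))    # sector offset of the v0.90 superblock

If a partition starts on a 64K boundary and ends within 64K of the end of the disk, this formula lands on the same sector whether applied to the partition or to the whole disk, hence the duplicate-looking superblock.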
Re: Superblocks
Why are you zeroing hdd? Shouldn't you be clearing sdd? On 10/26/07, Greg Cormier [EMAIL PROTECTED] wrote: Can someone help me understand superblocks and MD a little bit? I've got a raid5 array with 3 disks - sdb1, sdc1, sdd1. --examine on these 3 drives shows correct information. However, if I also examine the raw disk devices, sdb and sdd, they also appear to have superblocks with some semi-valid-looking information. sdc has no superblock. How can I clear these? If I unmount my raid, stop md0, it won't clear it. [EMAIL PROTECTED] ~]# mdadm --zero-superblock /dev/hdd mdadm: Couldn't open /dev/hdd for write - not zeroing I'd like to rule out these oddities before I start on my next troubleshooting of why my array rebuilds every time I reboot :) Thanks, Greg -- Raz
Superblocks
Can someone help me understand superblocks and MD a little bit? I've got a raid5 array with 3 disks - sdb1, sdc1, sdd1. --examine on these 3 drives shows correct information. However, if I also examine the raw disk devices, sdb and sdd, they also appear to have superblocks with some semi-valid-looking information. sdc has no superblock. How can I clear these? If I unmount my raid, stop md0, it won't clear it. [EMAIL PROTECTED] ~]# mdadm --zero-superblock /dev/hdd mdadm: Couldn't open /dev/hdd for write - not zeroing I'd like to rule out these oddities before I start on my next troubleshooting of why my array rebuilds every time I reboot :) Thanks, Greg
[PATCH] lvm2 support for detecting v1.x MD superblocks
lvm2's MD v1.0 superblock detection doesn't work at all (because it doesn't use v1 sb offsets). I've tested the attached patch to work on MDs with v0.90.0, v1.0, v1.1, and v1.2 superblocks. please advise, thanks. Mike

Index: lib/device/dev-md.c
===================================================================
RCS file: /cvs/lvm2/LVM2/lib/device/dev-md.c,v
retrieving revision 1.5
diff -u -r1.5 dev-md.c
--- lib/device/dev-md.c	20 Aug 2007 20:55:25 -	1.5
+++ lib/device/dev-md.c	23 Oct 2007 15:17:57 -
@@ -25,6 +25,40 @@
 #define MD_NEW_SIZE_SECTORS(x) ((x & ~(MD_RESERVED_SECTORS - 1)) \
				- MD_RESERVED_SECTORS)

+int dev_has_md_sb(struct device *dev, uint64_t sb_offset, uint64_t *sb)
+{
+	int ret = 0;
+	uint32_t md_magic;
+
+	/* Version 1 is little endian; version 0.90.0 is machine endian */
+	if (dev_read(dev, sb_offset, sizeof(uint32_t), &md_magic) &&
+	    ((md_magic == xlate32(MD_SB_MAGIC)) ||
+	     (md_magic == MD_SB_MAGIC))) {
+		if (sb)
+			*sb = sb_offset;
+		ret = 1;
+	}
+	return ret;
+}
+
+uint64_t v1_sb_offset(uint64_t size, int minor_version) {
+	uint64_t sb_offset;
+	switch(minor_version) {
+	case 0:
+		sb_offset = size;
+		sb_offset -= 8*2;
+		sb_offset &= ~(4*2-1);
+		break;
+	case 1:
+		sb_offset = 0;
+		break;
+	case 2:
+		sb_offset = 4*2;
+		break;
+	}
+	sb_offset <<= SECTOR_SHIFT;
+	return sb_offset;
+}
+
 /*
  * Returns -1 on error
  */
@@ -35,7 +69,6 @@
 #ifdef linux

 	uint64_t size, sb_offset;
-	uint32_t md_magic;

 	if (!dev_get_size(dev, &size)) {
 		stack;
@@ -50,16 +83,20 @@
 		return -1;
 	}

-	sb_offset = MD_NEW_SIZE_SECTORS(size) << SECTOR_SHIFT;
-
 	/* Check if it is an md component device. */
-	/* Version 1 is little endian; version 0.90.0 is machine endian */
-	if (dev_read(dev, sb_offset, sizeof(uint32_t), &md_magic) &&
-	    ((md_magic == xlate32(MD_SB_MAGIC)) ||
-	     (md_magic == MD_SB_MAGIC))) {
-		if (sb)
-			*sb = sb_offset;
+	/* Version 0.90.0 */
+	sb_offset = MD_NEW_SIZE_SECTORS(size) << SECTOR_SHIFT;
+	if (dev_has_md_sb(dev, sb_offset, sb)) {
 		ret = 1;
+	} else {
+		/* Version 1, try v1.0 - v1.2 */
+		int minor;
+		for (minor = 0; minor <= 2; minor++) {
+			if (dev_has_md_sb(dev, v1_sb_offset(size, minor), sb)) {
+				ret = 1;
+				break;
+			}
+		}
 	}

 	if (!dev_close(dev))
Re: [lvm-devel] [PATCH] lvm2 support for detecting v1.x MD superblocks
On Tue, Oct 23, 2007 at 11:32:56AM -0400, Mike Snitzer wrote: I've tested the attached patch to work on MDs with v0.90.0, v1.0, v1.1, and v1.2 superblocks. I'll apply this, thanks, but need to add comments (or reference) to explain what the hard-coded numbers are: sb_offset = (size - 8 * 2) & ~(4 * 2 - 1); etc. Alasdair -- [EMAIL PROTECTED]
Re: [lvm-devel] [PATCH] lvm2 support for detecting v1.x MD superblocks
On 10/23/07, Alasdair G Kergon [EMAIL PROTECTED] wrote: On Tue, Oct 23, 2007 at 11:32:56AM -0400, Mike Snitzer wrote: I've tested the attached patch to work on MDs with v0.90.0, v1.0, v1.1, and v1.2 superblocks. I'll apply this, thanks, but need to add comments (or reference) to explain what the hard-coded numbers are: sb_offset = (size - 8 * 2) & ~(4 * 2 - 1); etc. All values are in terms of sectors; so that is where the * 2 is coming from. The v1.0 case follows the same model as the MD_NEW_SIZE_SECTORS which is used for v0.90.0. The difference is that the v1.0 superblock is found at least 8K, but less than 12K, from the end of the device. The same switch statement is used in mdadm and is accompanied with the following comment:

    /*
     * Calculate the position of the superblock.
     * It is always aligned to a 4K boundary and
     * depending on minor_version, it can be:
     * 0: At least 8K, but less than 12K, from end of device
     * 1: At start of device
     * 2: 4K from start of device.
     */

Would it be sufficient to add that comment block above v1_sb_offset()'s switch statement? thanks, Mike
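As a worked example of those rules (a sketch only; the device size is hypothetical, and all values are in 512-byte sectors):

    size=204800                              # a 100 MiB device, in sectors
    echo $(( (size - 8*2) & ~(4*2 - 1) ))    # v1.0: 204784, i.e. 8K from the end
    echo 0                                   # v1.1: at the start of the device
    echo $(( 4*2 ))                          # v1.2: sector 8, i.e. 4K in
    # shift any of these left by SECTOR_SHIFT (multiply by 512) for a byte offset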
Re: mdadm 2.6.3 segfaults on assembly (v1 superblocks)
also sprach Neil Brown [EMAIL PROTECTED] [2007.09.24.0528 +0100]: Sure could. Thanks for the report. This patch (already in .git) should fix it. Apparently it does not, and it seems to be amd64-only since I saw it on amd64 and a bunch of people reported success on i386: http://bugs.debian.org/444682 Any help appreciated. I don't have an amd64 system around for another three weeks... -- martin
[PATCH] Fix segfault on assembly on amd64 with v1 superblocks
Commit a40b4fe introduced a temporary supertype variable tst, instead of manipulating st directly. However, it was forgotten to pass tst into the recursive load_super1 call, causing an infinite recursion.

Signed-off-by: martin f. krafft [EMAIL PROTECTED]
---
 super1.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/super1.c b/super1.c
index 52783e7..06c2655 100644
--- a/super1.c
+++ b/super1.c
@@ -1001,7 +1001,7 @@ static int load_super1(struct supertype *st, int fd, void **sbp, char *devname)
 		/* guess... choose latest ctime */
 		tst.ss = &super1;
 		for (tst.minor_version = 0; tst.minor_version <= 2 ; tst.minor_version++) {
-			switch(load_super1(st, fd, sbp, devname)) {
+			switch(load_super1(&tst, fd, sbp, devname)) {
 			case 0: super = *sbp;
 				if (bestvers == -1 ||
 				    bestctime < __le64_to_cpu(super->ctime)) {
--
1.5.3.1
Re: Bug#444682: [PATCH] Fix segfault on assembly on amd64 with v1 superblocks
I've tested this patch and it works :) Daniel On Sun, 2007-09-30 at 13:22 +0100, martin f. krafft wrote: Commit a40b4fe introduced a temporary supertype variable tst, instead of manipulating st directly. However, it was forgotten to pass tst into the recursive load_super1 call, causing an infinite recursion.

Signed-off-by: martin f. krafft [EMAIL PROTECTED]
---
 super1.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/super1.c b/super1.c
index 52783e7..06c2655 100644
--- a/super1.c
+++ b/super1.c
@@ -1001,7 +1001,7 @@ static int load_super1(struct supertype *st, int fd, void **sbp, char *devname)
 		/* guess... choose latest ctime */
 		tst.ss = &super1;
 		for (tst.minor_version = 0; tst.minor_version <= 2 ; tst.minor_version++) {
-			switch(load_super1(st, fd, sbp, devname)) {
+			switch(load_super1(&tst, fd, sbp, devname)) {
 			case 0: super = *sbp;
 				if (bestvers == -1 ||
 				    bestctime < __le64_to_cpu(super->ctime)) {
Re: mdadm 2.6.3 segfaults on assembly (v1 superblocks)
On Friday September 7, [EMAIL PROTECTED] wrote: Neil, could this be a bug? Sure could. Thanks for the report. This patch (already in .git) should fix it. NeilBrown

---
Don't corrupt 'supertype' when speculatively calling load_super1

When load_super1 is trying to see which sub-version of v1 superblock is present, failure will cause it to clear st->ss, which is not good. So use a temporary 'super_type' for the 'test if this version works' calls, then copy that into 'st' on success.

### Diffstat output
 ./super1.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff .prev/super1.c ./super1.c
--- .prev/super1.c	2007-09-24 14:26:19.0 +1000
+++ ./super1.c	2007-09-24 14:23:11.0 +1000
@@ -996,34 +996,35 @@ static int load_super1(struct supertype
 
 	if (st->ss == NULL || st->minor_version == -1) {
 		int bestvers = -1;
+		struct supertype tst;
 		__u64 bestctime = 0;
 		/* guess... choose latest ctime */
-		st->ss = &super1;
-		for (st->minor_version = 0; st->minor_version <= 2 ; st->minor_version++) {
+		tst.ss = &super1;
+		for (tst.minor_version = 0; tst.minor_version <= 2 ; tst.minor_version++) {
 			switch(load_super1(st, fd, sbp, devname)) {
 			case 0: super = *sbp;
 				if (bestvers == -1 ||
 				    bestctime < __le64_to_cpu(super->ctime)) {
-					bestvers = st->minor_version;
+					bestvers = tst.minor_version;
 					bestctime = __le64_to_cpu(super->ctime);
 				}
 				free(super);
 				*sbp = NULL;
 				break;
-			case 1: st->ss = NULL; return 1; /*bad device */
+			case 1: return 1; /*bad device */
 			case 2: break; /* bad, try next */
 			}
 		}
 		if (bestvers != -1) {
 			int rv;
-			st->minor_version = bestvers;
-			st->ss = &super1;
-			st->max_devs = 384;
+			tst.minor_version = bestvers;
+			tst.ss = &super1;
+			tst.max_devs = 384;
 			rv = load_super1(st, fd, sbp, devname);
-			if (rv) st->ss = NULL;
+			if (rv == 0)
+				*st = tst;
 			return rv;
 		}
-		st->ss = NULL;
 		return 2;
 	}
 	if (!get_dev_size(fd, devname, &dsize))
mdadm 2.6.3 segfaults on assembly (v1 superblocks)
Hi, preparing the Debian package for mdadm 2.6.3, I found a segfault in mdadm/Assemble.c:254, in the line:

    } else if (tst->ss->load_super(tst, dfd, &super, NULL)) {

The problem is that tst->ss is NULL, due to reasons I have not yet uncovered. The segfault happens only in the second iteration of the for loop at line 212, and the load_super1 call, caused by the above load_super in the first iteration, causes tst->ss to be set to NULL. This happens in the first recursion (load_super1 calls itself), at which point the if (dsize < 24) check in super1.c:1033 fails and thus returns 1, which causes the outer load_super1 function to return 1 after setting st->ss to NULL in line super1.c:1013. This all happens while the dfd variable in Assemble.c:254 has value 8, and assuming this is a file descriptor, then lsof says:

    mdadm 25664 root 8r BLK 22,3 2806 /dev/hdc3

/dev/hdc3 is an extended partition on the disk.

    /dev/hdc1   *        1       8     64228+  83  Linux
    /dev/hdc2            9     132    996030   82  Linux swap / Solaris
    /dev/hdc3          133   30401 243135742+   5  Extended
    /dev/hdc5          133     256    995998+  83  Linux
    /dev/hdc6          257     505       261   83  Linux
    /dev/hdc7          506   28347 223640833+  83  Linux
    /dev/hdc8        28348   30339  16000708+  83  Linux
    /dev/hdc9        30340   30401    497983+  83  Linux

I am failing to reproduce this on v0.9 superblock systems. Neil, could this be a bug? -- martin
mdadm: different component-count in superblocks prevents assembly
Hello Again :) Having a component device with slightly different superblock characteristics in the system prevents mdadm from assembling arrays. For example: mdadm --fail /dev/mdx /dev/xdx mdadm -G -n y-1 /dev/mdx would lead to a non-assemble-able /dev/mdx as long as /dev/xdx remains in the system, and thus probably leads to an unbootable system should /dev/mdx be /. Wouldn't it make sense to weight other superblock parameters, like the event counter, a little higher, i.e. among all devices with the same unique ID, first consider only those sharing the highest event count, and only afterwards compare the rest of the superblocks (and then fail, if they don't match, of course)? regards Mario
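A concrete instance of the scenario Mario sketches, with hypothetical devices standing in for mdx/xdx (a sketch, not a recommendation):

    mdadm /dev/md0 --fail /dev/sdc1           # sdc1 is dropped but keeps its old superblock
    mdadm --grow /dev/md0 --raid-devices=2    # shrink the member count from 3 to 2
    # sdc1's stale superblock (same UUID, old raid_disks count) is still on
    # disk; at the next assembly it conflicts with the surviving members and
    # can keep md0 from assembling while sdc1 is present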
Re: conflicting superblocks - Re: what is the best approach for fixing a degraded RAID5 (one drive failed) using mdadm?
On Tuesday June 12, [EMAIL PROTECTED] wrote: Can anyone please advise which commands we should use to get the array back to at least a read only state? mdadm --assemble /dev/md0 /dev/sd[abcd]2 and let mdadm figure it out. It is good at that. If the above doesn't work, add --force, but be aware that there is some possibility of hidden data corruption. At least a fsck would be advised. NeilBrown
problems with faulty disks and superblocks 1.0, 1.1 and 1.2
Hello, I'm having problems with a RAID-1 configuration. I cannot re-add a disk that I've failed, because each time I do this, the re-added disk is still seen as failed. After some investigation, I found that this problem only occurs when I create the RAID array with superblocks 1.0, 1.1 and 1.2. With the superblock 0.90 I don't encounter this issue. Here are the commands to easily reproduce the issue:

    mdadm -C /dev/md_d0 -e 1.0 -l 1 -n 2 -b internal -R /dev/sda /dev/sdb
    mdadm /dev/md_d0 -f /dev/sda
    mdadm /dev/md_d0 -r /dev/sda
    mdadm /dev/md_d0 -a /dev/sda
    cat /proc/mdstat

The output of mdstat is:

    Personalities : [raid1]
    md_d0 : active raid1 sda[0](F) sdb[1]
          104849 blocks super 1.2 [2/1] [_U]
          bitmap: 0/7 pages [0KB], 8KB chunk

    unused devices: <none>

I'm wondering if the way I'm failing and re-adding a disk is correct. Did I do something wrong? If I change the superblock to -e 0.90, there's no problem with this set of commands. For now, I found a work-around with superblock 1.0 which consists of zeroing the superblock before re-adding the disk. But I suppose that doing so will force a full rebuild of the re-added disk, and I don't want this, because I'm using write-intent bitmaps. I'm using mdadm - v2.5.6 on Debian Etch with kernel 2.6.18-4. Bug or misunderstanding on my part? Any help would be appreciated :) Thanks Hubert
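The work-around Hubert mentions, spelled out as a sketch (same devices as above):

    mdadm /dev/md_d0 -r /dev/sda          # remove the failed member first
    mdadm --zero-superblock /dev/sda      # discard its stale v1.x metadata
    mdadm /dev/md_d0 -a /dev/sda          # re-add; now accepted, but as a fresh
                                          # member, so it triggers a full resync
                                          # rather than a bitmap-based catch-up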
Re: Moron Destroyed RAID6 Array Superblocks
If the array consisted of 7 drives, two of them would be parity and I'd need to leave them out of the array in order to avoid resyncing. Would I need to specify them as 'missing' so the drives are in the correct order? For example: mdadm --raid-devices=7 --level=6 --foo /dev/md0 /dev/sda1 /dev/sdc1 missing /dev/sdb1 /dev/sdg1 missing /dev/sdh1 Or alternatively, with the --assume-clean switch, should I specify all 7 drives and see if it mounts? -A On Sun, 2007-04-08 at 08:21 -0700, Andrew Burgess wrote: After reading through the linux-raid archives I have been lead to believe that I can recover my array by doing a --create and listing the 7 drives in the exact order I originally created them in. Is this correct? If so, the kernel upgrade managed to shuffle the drive names around... is there any way I can figure out what order they should be in? If you know the old order, just boot the old kernel and recreate there. If you are using kernel rpms, you always want to 'install' new kernels and not 'upgrade'; that keeps both the old and new kernel available. Otherwise, if you pick just 5 of the 7 drives from a raid 6 array, it seems to me you have 5 factorial combinations = 120. You could write a script that tries them all and then tests by trying a read-only mount? Good luck. I mashed a 3.5 TB array once by rerunning the lvm creation, which zeros the first chunk of the resulting device. I was used to mdadm not hurting user data when creating. Grrr. BTW when you test-recreate the array, make sure you don't start syncing; that's another reason to just use 5 drives. HTH
Re: Moron Destroyed RAID6 Array Superblocks
Aaron C. de Bruyn wrote: Ok--I got moved in to my new place and am back and running on the 'net. I sat down for a few hours and attempted to write a script to try all possible combinations of drives... but I have to admit that I'm lost. I have 8 drives in the array--and I can output every possible combination of those. But what the heck would be the logic to output all combinations of the 8 drives using only 6 at a time? My head hurts. You want only 5 at a time, don't you? 8 drives - 1 spare - 2 parity = 5. Anyway, a very quick-n-dirty way to get what you want is to:

1. calculate all permutations
2. strip away the last items of each permutation
3. get rid of duplicate lines

    $ wget http://hayne.net/MacDev/Perl/permutations
    $ perl permutations a b c d e f g h | cut -d' ' -f-5 | sort -u

The Perl script above is from: http://hayne.net/MacDev/Perl/ -Corey
Re: Manually hacking superblocks
You will need a --create

This fixed the issue. So far, at least, I couldn't find any data corruption either :) Thank you.
Manually hacking superblocks
I managed to mess up a RAID-5 array by mdadm --add'ing a few failed disks back, trying to get the array running again. Unfortunately, --add didn't do what I expected, but instead made spares out of the failed disks. The disks failed due to loose SATA cabling and the data inside should be fairly consistent. sdh failed a bit earlier than sdd and sde, so I expect to be able to recover by building a degraded array without sdh and then syncing. The current situation looks like this:

       Number   Major   Minor   RaidDevice   State
    0     0       8      33        0         active sync   /dev/sdc1
    1     1       0       0        1         faulty removed
    2     2       8      97        2         active sync   /dev/sdg1
    3     3       8     129        3         active sync   /dev/sdi1
    4     4       0       0        4         faulty removed
    5     5       8      81        5         active sync   /dev/sdf1
    6     6       0       0        6         faulty removed
    7     7       8     177        7         spare
    8     8       8     161        8         spare
    9     9       8     145        9         spare

... and before any of this happened, the configuration was:

    disk 0, o:1, dev:sdc1
    disk 1, o:1, dev:sde1
    disk 2, o:1, dev:sdg1
    disk 3, o:1, dev:sdi1
    disk 4, o:1, dev:sdh1
    disk 5, o:1, dev:sdf1
    disk 6, o:1, dev:sdd1

I gather that I need a way to alter the superblocks of sde and sdd so that they seem to be clean, up-to-date disks with their original disk numbers 1 and 6. A hex editor comes to mind, but are there any better tools for that?
Re: Manually hacking superblocks
Lasse Kärkkäinen wrote: I managed to mess up a RAID-5 array by mdadm --add'ing a few failed disks back, trying to get the array running again. Unfortunately, --add didn't do what I expected, but instead made spares out of the failed disks. The disks failed due to loose SATA cabling and the data inside should be fairly consistent. sdh failed a bit earlier than sdd and sde, so I expect to be able to recover by building a degraded array without sdh and then syncing. The current situation looks like this:

       Number   Major   Minor   RaidDevice   State
    0     0       8      33        0         active sync   /dev/sdc1
    1     1       0       0        1         faulty removed
    2     2       8      97        2         active sync   /dev/sdg1
    3     3       8     129        3         active sync   /dev/sdi1
    4     4       0       0        4         faulty removed
    5     5       8      81        5         active sync   /dev/sdf1
    6     6       0       0        6         faulty removed
    7     7       8     177        7         spare
    8     8       8     161        8         spare
    9     9       8     145        9         spare

... and before any of this happened, the configuration was:

    disk 0, o:1, dev:sdc1
    disk 1, o:1, dev:sde1
    disk 2, o:1, dev:sdg1
    disk 3, o:1, dev:sdi1
    disk 4, o:1, dev:sdh1
    disk 5, o:1, dev:sdf1
    disk 6, o:1, dev:sdd1

I gather that I need a way to alter the superblocks of sde and sdd so that they seem to be clean, up-to-date disks with their original disk numbers 1 and 6. A hex editor comes to mind, but are there any better tools for that?

You don't need a tool. mdadm --force will do what you want. Read the archives and the man page. You are correct to assemble the array with a missing disk (or 2 missing disks for RAID6) - this prevents the kernel from trying to sync. Not syncing is good because if you do make a slight error in the order, you can end up syncing bad data over good. I *THINK* you should try something like (untested):

    mdadm --assemble /dev/md0 --force /dev/sdc1 /dev/sde1 /dev/sdg1 /dev/sdi1 missing /dev/sdf1 /dev/sdd1

The order is important and should match the original order. There's more you could do by looking at device event counts (--examine). Also, you must do a READ-ONLY mount the first time you mount the array - this will check the consistency and avoid corruption if you get the order wrong. I really must get around to setting up a test environment so I can check this out and update the wiki... I have to go out for a couple of hours. Let me know how it goes if you can't wait for me to get back. David
Re: Manually hacking superblocks
On Friday April 13, [EMAIL PROTECTED] wrote: Lasse Kärkkäinen wrote:

    disk 0, o:1, dev:sdc1
    disk 1, o:1, dev:sde1
    disk 2, o:1, dev:sdg1
    disk 3, o:1, dev:sdi1
    disk 4, o:1, dev:sdh1
    disk 5, o:1, dev:sdf1
    disk 6, o:1, dev:sdd1

I gather that I need a way to alter the superblocks of sde and sdd so that they seem to be clean, up-to-date disks with their original disk numbers 1 and 6. A hex editor comes to mind, but are there any better tools for that? .. I *THINK* you should try something like (untested): mdadm --assemble /dev/md0 --force /dev/sdc1 /dev/sde1 /dev/sdg1 /dev/sdi1 missing /dev/sdf1 /dev/sdd1 The order is important and should match the original order. There's more you could do by looking at device event counts (--examine)

You will need a --create. --assemble ignores the order in which the devices are given. It uses the information on the drives. Once you do a --add, you lose that information. It is good that you know the original order. Use --examine to confirm the chunk size or any other details you might not be sure of, and recreate the array. Leave one disk (the least likely to be up to date) as 'missing' and then try 'fsck -n' to ensure the data is ok. NeilBrown
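A sketch of Neil's recipe applied to the layout quoted above. The chunk size here is an assumption; confirm it with --examine first. sdh1 (the earliest failure, hence least up to date) is the slot left as 'missing':

    mdadm --create /dev/md0 --level=5 --raid-devices=7 --chunk=64 \
          /dev/sdc1 /dev/sde1 /dev/sdg1 /dev/sdi1 missing /dev/sdf1 /dev/sdd1
    fsck -n /dev/md0    # read-only check (assumes the fs sits directly on md0)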
Re: RAID5 superblocks partly messed up after degradation
Neil Brown wrote: I'll see what I can do :-) The problem could be resolved by removing one of the two external SATA controllers (PCI card with ALI M5283) and using kernel 2.6.20.6. Only removing the ALI PCI card brought the numbering scheme in line again, so the old (degraded) array became accessible again. Even with no disks attached to it, the kernel did not get its disk naming in shape to assemble more than one (sdd) of the four old devices, although all 4 devices could be accessed with mdadm --examine or fdisk. Additionally, using 2.6.20.6 resolved the ghost device issue where one SATA drive appeared additionally as a PATA drive, too. Now I could create the new array and copy over all data from the degraded one. *wipes sweat* Are there any kernel logs from this time which might make it clear what is happening? Not really, the logs contain quite a mess of trying different kernels and system configurations to get the data back, with the disc naming following the state of the moon. I produced too much sweat to get the data back, so I am reluctant to try anything now that it works until my backup scheme is somewhat improved :) Thank you for your help. Frank
Re: Moron Destroyed RAID6 Array Superblocks
Ok--I got moved in to my new place and am back and running on the 'net. I sat down for a few hours and attempted to write a script to try all possible combinations of drives... but I have to admit that I'm lost. I have 8 drives in the array--and I can output every possible combination of those. But what the heck would be the logic to output all combinations of the 8 drives using only 6 at a time? My head hurts. -A On Sun, 2007-04-08 at 08:21 -0700, Andrew Burgess wrote: After reading through the linux-raid archives I have been lead to believe that I can recover my array by doing a --create and listing the 7 drives in the exact order I originally created them in. Is this correct? If so, the kernel upgrade managed to shuffle the drive names around... is there any way I can figure out what order they should be in? If you know the old order, just boot the old kernel and recreate there. If you are using kernel rpms, you always want to 'install' new kernels and not 'upgrade'; that keeps both the old and new kernel available. Otherwise, if you pick just 5 of the 7 drives from a raid 6 array, it seems to me you have 5 factorial combinations = 120. You could write a script that tries them all and then tests by trying a read-only mount? Good luck. I mashed a 3.5 TB array once by rerunning the lvm creation, which zeros the first chunk of the resulting device. I was used to mdadm not hurting user data when creating. Grrr. BTW when you test-recreate the array, make sure you don't start syncing; that's another reason to just use 5 drives. HTH
Re: RAID5 superblocks partly messed up after degradation
On Monday April 9, [EMAIL PROTECTED] wrote: Hello, hopefully someone can help me. I'll see what I can do :-) The 4 x 300 RAID can not be assembled anymore.

    mdadm --assemble --verbose --no-degraded /dev/md5 /dev/hdc1 /dev/sdb1 /dev/sdc1 /dev/sdd1
    mdadm: looking for devices for /dev/md5
    mdadm: /dev/hdc1 is identified as a member of /dev/md5, slot 2.
    mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot 3.
    mdadm: /dev/sdc1 is identified as a member of /dev/md5, slot 0.
    mdadm: /dev/sdd1 is identified as a member of /dev/md5, slot 1.
    mdadm: added /dev/sdd1 to /dev/md5 as 1
    mdadm: failed to add /dev/hdc1 to /dev/md5: Invalid argument
    mdadm: failed to add /dev/sdb1 to /dev/md5: Invalid argument
    mdadm: failed to add /dev/sdc1 to /dev/md5: Invalid argument
    mdadm: /dev/md5 assembled from 0 drives (out of 4), but not started.

That is a little odd. Looking at the 'Events' number on the devices as given below, sdd1 is way behind all the others, and so mdadm should not be including it... and even if it is, the kernel should simply let the one with the higher event count over-ride. Are there any kernel logs from this time which might make it clear what is happening? Detaching sdb (the faulty disk) does not make any difference. sdd1 seems to be the problem. Can you run the above 'mdadm' command but without listing /dev/sdd1?

    /dev/sdb1:
    this     3       8      49        3      active sync   /dev/sdd1   <- what's this? why sdd? This is sdb! Also, this is the faulty device!

/dev/sdd1 is the device with major/minor numbers 8,49. The last time the array was assembled, the device at slot '3' had major/minor numbers 8,49. It is telling you what the situation used to be, not what it is now. Just ignore it... NeilBrown
RAID5 superblocks partly messed up after degradation
Hello, hopefully someone can help me. MD RAID 5 with 4 disks (3x SATA, 1x PATA) of 300 GB each; kernel 2.6.19.5, openSUSE 10.2. One of the SATA disks had unrecoverable read errors when copying data, so the disk was marked bad by MD and the array as degraded. As I had no spare, but the array was supposed to be upgraded to 4 x 500 GB anyway, I chose to install two additional SATA controllers, attach 4 SATA 500 GB disks to them and create a new MD RAID 5 on these (using openSUSE 10.2 (kernel 2.6.18.x) as rescue disk) to copy over the data from the degraded one. I only touched those blank 500 GB disks, but maybe SUSE messed things up, because the kernel recognized one of the disks twice (once as SATA (/dev/sd..) and once as a PATA shadow (/dev/hd..; not really accessible and not a duplicate of any existing drive)). This may have caused the problem I now have: the 4 x 300 RAID can not be assembled anymore.

    mdadm --assemble --verbose --no-degraded /dev/md5 /dev/hdc1 /dev/sdb1 /dev/sdc1 /dev/sdd1
    mdadm: looking for devices for /dev/md5
    mdadm: /dev/hdc1 is identified as a member of /dev/md5, slot 2.
    mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot 3.
    mdadm: /dev/sdc1 is identified as a member of /dev/md5, slot 0.
    mdadm: /dev/sdd1 is identified as a member of /dev/md5, slot 1.
    mdadm: added /dev/sdd1 to /dev/md5 as 1
    mdadm: failed to add /dev/hdc1 to /dev/md5: Invalid argument
    mdadm: failed to add /dev/sdb1 to /dev/md5: Invalid argument
    mdadm: failed to add /dev/sdc1 to /dev/md5: Invalid argument
    mdadm: /dev/md5 assembled from 0 drives (out of 4), but not started.

Detaching sdb (the faulty disk) does not make any difference. When looking at the superblocks, only the sdd1 superblock looks OK, but sdb1, sdc1 and hdc1 look weird:

    /dev/hdc1:
              Magic : a92b4efc
            Version : 00.90.03
               UUID : 5bf2ddc1:c64ab6ba:7364bdad:c081d4e6
      Creation Time : Fri Jan 20 23:24:21 2006
         Raid Level : raid5
        Device Size : 281145408 (268.12 GiB 287.89 GB)
         Array Size : 843436224 (804.36 GiB 863.68 GB)
       Raid Devices : 4
      Total Devices : 3
    Preferred Minor : 0
        Update Time : Tue Mar 27 22:00:53 2007
              State : clean
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 1
      Spare Devices : 0
           Checksum : 9a0111e7 - correct
             Events : 0.649118
             Layout : left-symmetric
         Chunk Size : 64K

          Number   Major   Minor   RaidDevice   State
    this     2      22       1        2         active sync   /dev/hdc1
       0     0       8      65        0         active sync
       1     1       0       0        1         faulty removed
       2     2      22       1        2         active sync   /dev/hdc1
       3     3       8      49        3         active sync   /dev/sdd1

    /dev/sdb1:
              Magic : a92b4efc
            Version : 00.90.03
               UUID : 5bf2ddc1:c64ab6ba:7364bdad:c081d4e6
      Creation Time : Fri Jan 20 23:24:21 2006
         Raid Level : raid5
        Device Size : 281145408 (268.12 GiB 287.89 GB)
         Array Size : 843436224 (804.36 GiB 863.68 GB)
       Raid Devices : 4
      Total Devices : 3
    Preferred Minor : 0
        Update Time : Tue Mar 27 22:00:53 2007
              State : clean
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 1
      Spare Devices : 0
           Checksum : 9a01120b - correct
             Events : 0.649118
             Layout : left-symmetric
         Chunk Size : 64K

          Number   Major   Minor   RaidDevice   State
    this     3       8      49        3         active sync   /dev/sdd1   <- what's this? why sdd? This is sdb! Also, this is the faulty device!
       0     0       8      65        0         active sync
       1     1       0       0        1         faulty removed
       2     2      22       1        2         active sync   /dev/hdc1
       3     3       8      49        3         active sync   /dev/sdd1

    /dev/sdc1:
              Magic : a92b4efc
            Version : 00.90.03
               UUID : 5bf2ddc1:c64ab6ba:7364bdad:c081d4e6
      Creation Time : Fri Jan 20 23:24:21 2006
         Raid Level : raid5
        Device Size : 281145408 (268.12 GiB 287.89 GB)
         Array Size : 843436224 (804.36 GiB 863.68 GB)
       Raid Devices : 4
      Total Devices : 3
    Preferred Minor : 0
        Update Time : Tue Mar 27 22:00:53 2007
              State : clean
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 1
      Spare Devices : 0
           Checksum : 9a011215 - correct
             Events : 0.649118
             Layout : left-symmetric
         Chunk Size : 64K

          Number   Major   Minor   RaidDevice   State
    this     0       8      65        0         active sync   <- where is the device name?
       0     0       8      65        0         active sync
       1     1       0       0        1         faulty removed
       2     2      22       1        2         active sync   /dev/hdc1
       3     3       8      49        3         active sync   /dev/sdd1

    /dev/sdd1:
              Magic : a92b4efc
            Version : 00.90.03
               UUID : 5bf2ddc1:c64ab6ba:7364bdad:c081d4e6
      Creation Time : Fri Jan 20 23:24:21 2006
         Raid Level : raid5
        Device Size : 281145408 (268.12 GiB 287.89 GB)
         Array Size : 843436224
Re: Moron Destroyed RAID6 Array Superblocks
After reading through the linux-raid archives I have been lead to believe that I can recover my array by doing a --create and listing the 7 drives in the exact order I originally created them in. Is this correct? If so, the kernel upgrade managed to shuffle the drive names around... is there any way I can figure out what order they should be in? If you know the old order, just boot the old kernel and recreate there. If you are using kernel rpms, you always want to 'install' new kernels and not 'upgrade'; that keeps both the old and new kernel available. Otherwise, if you pick just 5 of the 7 drives from a raid 6 array, it seems to me you have 5 factorial combinations = 120. You could write a script that tries them all and then tests by trying a read-only mount? Good luck. I mashed a 3.5 TB array once by rerunning the lvm creation, which zeros the first chunk of the resulting device. I was used to mdadm not hurting user data when creating. Grrr. BTW when you test-recreate the array, make sure you don't start syncing; that's another reason to just use 5 drives. HTH
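Andrew's try-them-all idea, roughed out as a sketch. This rewrites superblocks and is only sane on an array that is already lost; 'orderings.txt' is a hypothetical file with one candidate drive order per line (the permutations script mentioned elsewhere in this thread can generate it), and a full search would also have to permute the positions of the two 'missing' slots:

    while read d1 d2 d3 d4 d5; do
        mdadm --stop /dev/md0 2>/dev/null
        mdadm --create /dev/md0 --run --assume-clean --level=6 --raid-devices=7 \
              "$d1" "$d2" "$d3" "$d4" "$d5" missing missing
        if mount -o ro /dev/md0 /mnt; then
            echo "candidate order: $d1 $d2 $d3 $d4 $d5"
            umount /mnt
            break
        fi
    done < orderings.txt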
Moron Destroyed RAID6 Array Superblocks
Ok--I'm a moron. Long story short, I was messing around with my RAID6 array and I managed to screw up two of the drives in my 7-drive 1-spare array. I had problems in the middle of a kernel upgrade and I kept getting errors about various drives having bad superblocks. So without knowing much about what I was doing, I did a --zero-superblock on all the drives in the array. So now I can't reassemble the array. After reading through the linux-raid archives I have been lead to believe that I can recover my array by doing a --create and listing the 7 drives in the exact order I originally created them in. Is this correct? If so, the kernel upgrade managed to shuffle the drive names around...is there any way I can figure out what order they should be in? -A
Hot add not supported by version 1.x superblocks?
Hello Neil. I have a RAID1 built using v1.2 superblocks. I seem to not be able to hot add a spare drive to this array. I get the error message: HOT_ADD may only be used with version-0 superblocks Does this really mean that hot add is not supported for arrays built with v1.x superblocks? If so this seems like a big regression in functionality. In this case, are there technical reasons why v1.x cannot support hot add? I am using a stock 2.6.18 kernel with mdadm v2.5.4 Thanks, Sean
Re: Hot add not supported by version 1.x superblocks?
False alarm. My apologies. There was a stale superblock on the disk I was trying to hot add as a spare. Once I wiped the superblock it was successfully added to the array. That being said, the error message issued wasn't exactly appropriate. Regards, Sean
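What Sean describes, condensed into a sketch (the spare's device name here is hypothetical; --zero-superblock destroys the old metadata, so make sure it is the right disk):

    mdadm -E /dev/sdc                    # shows the stale superblock from an old array
    mdadm --zero-superblock /dev/sdc     # wipe it
    mdadm /dev/md0 -a /dev/sdc           # the hot add now succeeds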
type-0.90.0 superblock and md-1 superblocks
Hello, I wonder what the type-0.90.0 superblock and md-1 superblocks of md are, and what is the difference between them? Is it merely some version of the kernel md driver? In case it is so - is there a way to know which version of the md superblocks is in use in my kernel? I see that in drivers/md/md.c there is an array of super_type with 2 elements; the name of the first is 0.90.0, and the name of the second is md-1. When running: mdadm --detail /dev/md0 I see: /dev/md0: Version : 00.90.01 .. is this Version field the superblock type? Regards, MR
Re: type-0.90.0 superblock and md-1 superblocks
On Monday October 9, [EMAIL PROTECTED] wrote: Hello, I wonder what the type-0.90.0 superblock and md-1 superblocks of md are, and what is the difference between them? Is it merely some version of the kernel md driver? Two different formats for the metadata describing the array. I usually refer to them as v0.90 and v1. Some of the differences are subtle. Some of the less subtle are:

- v0.90 is limited to 28 components in an array. v1 allows 256.
- v1 supports restarting a device recovery that was interrupted by a clean shutdown. v0.90 doesn't.
- v1 can easily be moved between machines with different byte-order. v0.90 needs explicit conversion.
- v0.90 records a small number to identify the array. v1 stores a textual name.
- v0.90 can be used with 'in kernel autodetect' (i.e. partition type 0xfd). v1 cannot (I consider this an improvement :-)
- v0.90 can get confused if a partition and a whole device are aligned so that the superblock would be at exactly the same location. v1 avoids this confusion.

In case it is so - is there a way to know which version of the md superblocks is in use in my kernel? 2.6 kernels support both, though there have been a number of v1-specific bugs slowly being ironed out over the months. From 2.6.18 I consider v1 perfectly stable for regular use. In earlier kernels it was quite usable but there were a few minor bugs. mdadm creates v0.90 by default, though the default can be changed in /etc/mdadm.conf. I see that in drivers/md/md.c there is an array of super_type with 2 elements; the name of the first is 0.90.0, and the name of the second is md-1. When running: mdadm --detail /dev/md0 I see: /dev/md0: Version : 00.90.01 .. is this Version field the superblock type? The 01 at the end is really a version number for the driver rather than a version number for the metadata (aka superblock) - md version numbers were not thought out properly and so are confusing. Yes, the Version reported by mdadm --detail reflects the superblock type. NeilBrown
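For illustration, picking the metadata format explicitly at creation time and checking what an array uses afterwards (hypothetical devices):

    mdadm --create /dev/md0 -e 0.90 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    mdadm --create /dev/md1 -e 1.0  --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
    mdadm --detail /dev/md1 | grep Version    # reports the superblock format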
Re: Question regarding superblocks
Phenomenal, thank you so much! This worked perfectly!! - Norm On 8/27/06, Neil Brown [EMAIL PROTECTED] wrote: On Sunday August 27, [EMAIL PROTECTED] wrote: Hi, I created a raid5 array with 3 devices. Unfortunately, I carelessly used the --zero-superblock option with mdadm to remove all of the superblock. I still have the mdadm.conf file which has the uuid in it. Is there a way for me to rebuild the superblocks or recover the data on these drives? Just recreate the array with the same devices in the same order and the same chunk size. All your data should still be there. NeilBrown
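A sketch of the recovery Neil describes for this thread's case (the member names, order and chunk size below are assumptions; they must match what the lost array actually used):

    mdadm --create /dev/md0 --level=5 --raid-devices=3 --chunk=64 \
          /dev/sdb1 /dev/sdc1 /dev/sdd1
    # --create only rewrites the metadata; with the same devices, order and
    # chunk size, the data blocks line up as before (a parity resync may run,
    # which is harmless if the geometry is right)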
Re: zeroing old superblocks upgrading...
On Friday July 7, [EMAIL PROTECTED] wrote: Neil> But if you wanted to (and were running a fairly recent kernel) you Neil> could Neil> mdadm --grow --bitmap=internal /dev/md0 Did this. And now I can do mdadm -X /dev/hde1 to examine the bitmap, but I think this totally blows. To create a bitmap, I add it to an md# device, but to examine it, I have to know which sub-devices to query? That's really not what I would expect. Hmm... I can see that. -X is a counterpart to -E; they are both letters from EXamine. They both look at a component and tell you what is there. -D (--detail) reports on the array. Maybe it should give more details about any bitmap. Neil> mdadm --grow --bitmap=none /dev/md0 Neil> and it should work with minimal resync... So why would I want to remove the bitmap? To return you to the state you started in. You didn't have a bitmap to start with, so I gave you a recipe that left you with no bitmap. I didn't want to assume anything about whether you want a bitmap or not. Of course, if you want to leave the bitmap there, that is fine. Possibly even a good idea. NeilBrown
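The round trip under discussion, condensed (md0 and hde1 as in this thread; note that -X must point at a component device, not at the array):

    mdadm --grow --bitmap=internal /dev/md0    # add an internal write-intent bitmap
    mdadm -X /dev/hde1                         # examine the bitmap via a component
    mdadm --detail /dev/md0                    # the per-array view
    mdadm --grow --bitmap=none /dev/md0        # remove the bitmap again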
zeroing old superblocks upgrading...
Neil, First off, thanks for all your hard work on this software; it's really a great thing to have. But I've got some interesting issues here, though not urgent. As I've said in other messages, I've got a pair of 120gb HDs mirrored. I'm using MD across partitions, /dev/hde1 and /dev/hdg1. Works nicely. But I see that I have an old superblock sitting around on /dev/hde (notice, no partition here!) which I'd like to clean up.

    # mdadm -E /dev/hde
    /dev/hde:
              Magic : a92b4efc
            Version : 00.90.00
               UUID : 9835ebd0:5d02ebf0:907edc91:c4bf97b2
      Creation Time : Fri Oct 24 19:11:02 2003
         Raid Level : raid1
        Device Size : 117220736 (111.79 GiB 120.03 GB)
         Array Size : 117220736 (111.79 GiB 120.03 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Update Time : Fri Oct 24 19:21:59 2003
              State : clean
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 1
      Spare Devices : 0
           Checksum : 79d2a6fd - correct
             Events : 0.2

          Number   Major   Minor   RaidDevice   State
    this     0       3       0        0         active sync   /dev/hda
       0     0       3       0        0         active sync   /dev/hda
       1     1       0       0        1         faulty

Here's the correct ones:

    # mdadm -E /dev/hde1
    /dev/hde1:
              Magic : a92b4efc
            Version : 00.90.00
               UUID : 2e078443:42b63ef5:cc179492:aecf0094
      Creation Time : Fri Oct 24 19:23:41 2003
         Raid Level : raid1
        Device Size : 117218176 (111.79 GiB 120.03 GB)
         Array Size : 117218176 (111.79 GiB 120.03 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Update Time : Thu Jul 6 18:21:08 2006
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
           Checksum : 210069e5 - correct
             Events : 0.7762540

          Number   Major   Minor   RaidDevice   State
    this     0      33       1        0         active sync   /dev/hde1
       0     0      33       1        0         active sync   /dev/hde1
       1     1      34       1        1         active sync   /dev/hdg1

I can't seem to zero it out:

    # mdadm --misc --zero-superblock /dev/hde
    mdadm: Couldn't open /dev/hde for write - not zeroing

Should I just ignore this, or should I break off /dev/hde from the array and scrub the disk and then re-add it back in? Also, can I upgrade my superblock to the latest version without any problems? Thanks, John
Re: zeroing old superblocks upgrading...
On Thursday July 6, [EMAIL PROTECTED] wrote: Neil, First off, thanks for all your hard work on this software, it's really a great thing to have. But I've got some interesting issues here. Though not urgent. As I've said in other messages, I've got a pair of 120gb HDs mirrored. I'm using MD across partitions, /dev/hde1 and /dev/hdg1. Works nicely. But I see that I have an old superblock sitting around on /dev/hde (notice, no partition here!) which I'd like to clean up. ... I can't seem to zero it out: # mdadm --misc --zero-superblock /dev/hde mdadm: Couldn't open /dev/hde for write - not zeroing Should I just ignore this, or should I break off /dev/hde from the array and scrub the disk and then re-add it back in? You could ignore it - it shouldn't hurt. But if you wanted to (and were running a fairly recent kernel) you could

    mdadm --grow --bitmap=internal /dev/md0
    mdadm /dev/md0 --fail /dev/hde1 --remove /dev/hde1
    mdadm --zero-superblock /dev/hde
    mdadm /dev/md0 --add /dev/hde1
    mdadm --grow --bitmap=none /dev/md0

and it should work with minimal resync... Though thinking about it - after the first --grow, check that the unwanted superblock is still there. It is quite possible that the internal bitmap will over-write the unwanted superblock (depending on the exact size and alignment of hde1 compared with hde). If it is gone, then don't bother with the rest of the sequence. Also, can I upgrade my superblock to the latest version without any problems? The only problem with superblock version numbers is that they are probably confusing. If you don't worry about them, they should just do the right thing. NeilBrown
[PATCH 002 of 12] md: Set desc_nr correctly for version-1 superblocks.
This has to be done in ->load_super, not ->validate_super. Without this, hot-adding devices to an array doesn't always work right - though there is a work around in mdadm-2.5.2 to make this less of an issue.

### Diffstat output
 ./drivers/md/md.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-06-27 12:15:17.0 +1000
+++ ./drivers/md/md.c	2006-06-27 12:17:32.0 +1000
@@ -1064,6 +1064,11 @@ static int super_1_load(mdk_rdev_t *rdev
 	if (rdev->sb_size & bmask)
 		rdev->sb_size = (rdev->sb_size | bmask)+1;
 
+	if (sb->level == cpu_to_le32(LEVEL_MULTIPATH))
+		rdev->desc_nr = -1;
+	else
+		rdev->desc_nr = le32_to_cpu(sb->dev_number);
+
 	if (refdev == 0)
 		ret = 1;
 	else {
@@ -1173,7 +1178,6 @@ static int super_1_validate(mddev_t *mdd
 	}
 	if (mddev->level != LEVEL_MULTIPATH) {
 		int role;
-		rdev->desc_nr = le32_to_cpu(sb->dev_number);
 		role = le16_to_cpu(sb->dev_roles[rdev->desc_nr]);
 		switch(role) {
 		case 0xffff: /* spare */
Re: Two disk failure in RAID5 during resync, wrong superblocks
Hi again, I've just seen I still had a wrong superblock in the subject of my mail. Please just ignore, I fixed that while writing the last mail and forgot to remove it. :) Greets, Frank
[PATCH 003 of 3] md: Make sure rdev->size gets set for version-1 superblocks.
Sometimes it doesn't, so make the code more like the version-0 code, which works.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2006-02-02 16:51:46.0 +1100
+++ ./drivers/md/md.c	2006-02-02 16:56:12.0 +1100
@@ -1025,7 +1025,7 @@ static int super_1_load(mdk_rdev_t *rdev
 		rdev->sb_size = (rdev->sb_size | bmask)+1;
 
 	if (refdev == 0)
-		return 1;
+		ret = 1;
 	else {
 		__u64 ev1, ev2;
 		struct mdp_superblock_1 *refsb =
@@ -1045,7 +1045,9 @@ static int super_1_load(mdk_rdev_t *rdev
 		ev2 = le64_to_cpu(refsb->events);
 
 		if (ev1 > ev2)
-			return 1;
+			ret = 1;
+		else
+			ret = 0;
 	}
 	if (minor_version)
 		rdev->size = ((rdev->bdev->bd_inode->i_size>>9) - le64_to_cpu(sb->data_offset)) / 2;
@@ -1059,7 +1061,7 @@ static int super_1_load(mdk_rdev_t *rdev
 	if (le32_to_cpu(sb->size) > rdev->size*2)
 		return -EINVAL;
 
-	return 0;
+	return ret;
 }
 
 static int super_1_validate(mddev_t *mddev, mdk_rdev_t *rdev)
[PATCH 002 of 5] md: Make sure array geometry changes persist with version-1 superblocks.
super_1_sync only updates fields in the superblock that might have changed. 'raid_disks' and 'size' could have changed, but this information doesn't get updated until this patch.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c | 3 +++
 1 file changed, 3 insertions(+)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2006-01-24 13:32:25.0 +1100
+++ ./drivers/md/md.c	2006-01-24 13:42:29.0 +1100
@@ -1162,6 +1162,9 @@ static void super_1_sync(mddev_t *mddev,
 
 	sb->cnt_corrected_read = atomic_read(&rdev->corrected_errors);
 
+	sb->raid_disks = cpu_to_le32(mddev->raid_disks);
+	sb->size = cpu_to_le64(mddev->size);
+
 	if (mddev->bitmap && mddev->bitmap_file == NULL) {
 		sb->bitmap_offset = cpu_to_le32((__u32)mddev->bitmap_offset);
 		sb->feature_map = cpu_to_le32(MD_FEATURE_BITMAP_OFFSET);
[PATCH md 006 of 8] Allow hot-adding devices to arrays with non-persistent superblocks.
It is possible (and occasionally useful) to have a raid1 without persistent superblocks. The code in add_new_disk for adding a device to such an array always tries to read a superblock. This will obviously fail. So do the appropriate test and call md_import_device with appropriate args.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c | 7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-08-22 11:48:37.0 +1000
+++ ./drivers/md/md.c	2005-08-22 11:49:47.0 +1000
@@ -2225,8 +2225,11 @@ static int add_new_disk(mddev_t * mddev,
 				mdname(mddev));
 			return -EINVAL;
 		}
-		rdev = md_import_device(dev, mddev->major_version,
-			mddev->minor_version);
+		if (mddev->persistent)
+			rdev = md_import_device(dev, mddev->major_version,
+						mddev->minor_version);
+		else
+			rdev = md_import_device(dev, -1, -1);
 		if (IS_ERR(rdev)) {
 			printk(KERN_WARNING
 				"md: md_import_device returned %ld\n",
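For context, a non-persistent-superblock array is the kind made with mdadm's --build mode rather than --create, e.g. (hypothetical devices):

    mdadm --build /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    # no superblock is written, so the command line (or mdadm.conf) carries
    # the whole configuration; hot-adding to such an array is what the
    # patch above fixes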
Re: Broken superblocks and --force
Joel, On Fri, 6 Oct 2000, Joel Becker wrote: Please reply personally. I just had my kernel go nuts in do_try_to_free_pages(), and when I bounced the box (sysrq sync + reboot) What kernel were you using? I assume you're aware of this bug in 2.2.15/16... I dropped back to 14 to stop this from happening, but am currently testing 2.2.17 (final) to see if that's OK. the array came back with a bad superblock, so it won't start. I'm starting to think (quick source glance) that mkraid will recover the superblocks and not wipe all the data with the force flag, but I'm not sure. Your superblock should /NOT/ have been corrupted. Can you post more information about this? IIRC all that should have happened is it should have fsck'd one of the disks (the first, I think?), and then resync'd that to the other one. Two disk RAID1. Running fine before the reboot. I'd like to get the data back. Thanks for the prompt reply. Can you post what you're getting on startup? Regards, Corin -- Corin Hartland-Swann, Commerce Internet Ltd, http://www.commerce.uk.net/
Re: Broken superblocks and --force
What do the boot= and root= lines in your lilo.conf look like? They should point to the root MD device, and not to one of the individual disks, e.g.:

    boot=/dev/md0
    ...
    image=/boot/vmlinuz
        root=/dev/md0

The only thing I can think is that it only knows about one of the disks at boot time, and that's why it's barfing. The method above means that it will also install the boot block onto both disks, instead of just the first. You need a patched version of lilo to work with MD devices like this - I'm afraid I don't know where to get it because I use Mandrake and it comes pre-patched. Regular lilo will work just fine. See the examples in the Boot+Root+Raid+Lilo HOWTO: http://ftp.bizsystems.net/pub/raid/Boot+Root+Raid+Lilo.html