array doesn't run even with --force
I've got a raid5 array with 5 disks where 2 failed. The failures are
occasional and only on a few sectors, so I tried to assemble it with 4
disks anyway:

# mdadm -A -f -R /dev/mdnumber /dev/disk1 /dev/disk2 /dev/disk3 /dev/disk4

However mdadm complains that one of the disks has an out-of-date
superblock and kicks it out, and then it cannot run the array with only
3 disks. Shouldn't it adjust the superblock and assemble-run it anyway?
That's what -f is for, no?

This is with kernel 2.6.22.16 and mdadm 2.6.4.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: array doesn't run even with --force
On Sunday January 20, [EMAIL PROTECTED] wrote:
> I've got a raid5 array with 5 disks where 2 failed. The failures are
> occasional and only on a few sectors, so I tried to assemble it with 4
> disks anyway:
>
> # mdadm -A -f -R /dev/mdnumber /dev/disk1 /dev/disk2 /dev/disk3 /dev/disk4
>
> However mdadm complains that one of the disks has an out-of-date
> superblock and kicks it out, and then it cannot run the array with only
> 3 disks. Shouldn't it adjust the superblock and assemble-run it anyway?
> That's what -f is for, no?
>
> This is with kernel 2.6.22.16 and mdadm 2.6.4.

Please provide actual commands and actual output.
Also add --verbose to the assemble command.
Also provide --examine for all devices.
Also provide any kernel log messages.

Thanks,
NeilBrown
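[Editor's note: the diagnostics requested above can be gathered in one pass with a short helper along these lines. This is a sketch, not part of mdadm: `collect_md_report` is a made-up name, and the member device list in the example is the one identified later in this thread.]

```shell
#!/bin/sh
# collect_md_report: print 'mdadm --examine' for every member device
# given on the command line, followed by recent kernel messages.
# A convenience sketch for collecting what was asked for above.
collect_md_report() {
    for dev in "$@"; do
        echo "=== $dev ==="
        mdadm --examine "$dev" 2>&1
    done
    echo "=== kernel log ==="
    dmesg | tail -n 100
}

# Example (member names from later in this thread):
# collect_md_report /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdd4 /dev/sde4 > md3-report.txt
```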
Re: array doesn't run even with --force
Neil Brown ([EMAIL PROTECTED]) wrote on 21 January 2008 12:13:
> Please provide actual commands and actual output.
> Also add --verbose to the assemble command.
> Also provide --examine for all devices.
> Also provide any kernel log messages.

The command is

mdadm -A --verbose -f -R /dev/md3 /dev/sda4 /dev/sdc4 /dev/sde4 /dev/sdd4

The failed areas are sdb4 (which I didn't include above) and sdd4. I did a

dd if=/dev/sdb4 of=/dev/hda4 bs=512 conv=noerror

and it complained about roughly 10 bad sectors. I did

dd if=/dev/sdd4 of=/dev/hdc4 bs=512 conv=noerror

and there were no errors; that's why I used sdd4 above. I tried to
substitute hdc4 for sdd4, and hda4 for sdb4, to no avail.

I don't have kernel logs because the failed area has /home and /var. The
double fault occurred during the holidays, so I don't know which
happened first. Below are the output of the command above and of
--examine.

mdadm: looking for devices for /dev/md3
mdadm: /dev/sda4 is identified as a member of /dev/md3, slot 0.
mdadm: /dev/sdc4 is identified as a member of /dev/md3, slot 2.
mdadm: /dev/sde4 is identified as a member of /dev/md3, slot 4.
mdadm: /dev/sdd4 is identified as a member of /dev/md3, slot 5.
mdadm: no uptodate device for slot 1 of /dev/md3
mdadm: added /dev/sdc4 to /dev/md3 as 2
mdadm: no uptodate device for slot 3 of /dev/md3
mdadm: added /dev/sde4 to /dev/md3 as 4
mdadm: added /dev/sdd4 to /dev/md3 as 5
mdadm: added /dev/sda4 to /dev/md3 as 0
mdadm: failed to RUN_ARRAY /dev/md3: Input/output error
mdadm: Not enough devices to start the array.

On screen it shows "kicking out of date..." for sdd4.

/dev/sda4:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 2f2f8327:375b4306:94521055:e3dc373b
  Creation Time : Tue May 11 16:03:35 2004
     Raid Level : raid5
  Used Dev Size : 70454400 (67.19 GiB 72.15 GB)
     Array Size : 281817600 (268.76 GiB 288.58 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 3

    Update Time : Wed Jan 16 16:00:53 2008
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 2
  Spare Devices : 0
       Checksum : 16119868 - correct
         Events : 0.14967284

         Layout : left-symmetric
     Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     0       8        4        0      active sync   /dev/sda4

   0     0       8        4        0      active sync   /dev/sda4
   1     1       0        0        1      active sync   - note the difference compared to sdc4
   2     2       8       36        2      active sync   /dev/sdc4
   3     3       0        0        3      faulty removed
   4     4       8       68        4      active sync   /dev/sde4

/dev/sdc4:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 2f2f8327:375b4306:94521055:e3dc373b
  Creation Time : Tue May 11 16:03:35 2004
     Raid Level : raid5
  Used Dev Size : 70454400 (67.19 GiB 72.15 GB)
     Array Size : 281817600 (268.76 GiB 288.58 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 3

    Update Time : Wed Jan 16 16:00:53 2008
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 2
  Spare Devices : 0
       Checksum : 1611988f - correct
         Events : 0.14967284

         Layout : left-symmetric
     Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     2       8       36        2      active sync   /dev/sdc4

   0     0       8        4        0      active sync   /dev/sda4
   1     1       0        0        1      faulty removed
   2     2       8       36        2      active sync   /dev/sdc4
   3     3       0        0        3      faulty removed
   4     4       8       68        4      active sync   /dev/sde4

/dev/sdd4:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 2f2f8327:375b4306:94521055:e3dc373b
  Creation Time : Tue May 11 16:03:35 2004
     Raid Level : raid5
  Used Dev Size : 70454400 (67.19 GiB 72.15 GB)
     Array Size : 281817600 (268.76 GiB 288.58 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 3

    Update Time : Fri Jan 11 18:45:17 2008
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 160b27ce - correct
         Events : 0.14967266

         Layout : left-symmetric
     Chunk Size : 128K

      Number   Major   Minor   RaidDevice
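[Editor's note: the Events counters in the superblocks above are the quickest way to see which member is stale: sda4 and sdc4 both show 0.14967284, while sdd4 stopped at 0.14967266, 18 events behind, which is why assembly kicks it out as out of date. A small helper to pull that field from saved --examine output — a sketch that assumes the 0.90 superblock format shown above; the file names in the example are hypothetical.]

```shell
#!/bin/sh
# events_of: print the Events counter from a file containing saved
# 'mdadm --examine' output (0.90 superblock format, as shown above).
# Members with the highest counter are the most up to date.
events_of() {
    awk -F: '/Events/ { gsub(/ /, "", $2); print $2; exit }' "$1"
}

# Example:
#   mdadm --examine /dev/sdd4 > sdd4.examine
#   events_of sdd4.examine        # prints the counter, e.g. 0.14967266
```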
Re: array doesn't run even with --force
On Monday January 21, [EMAIL PROTECTED] wrote:
> The command is
>
> mdadm -A --verbose -f -R /dev/md3 /dev/sda4 /dev/sdc4 /dev/sde4 /dev/sdd4
>
> mdadm: looking for devices for /dev/md3
> mdadm: /dev/sda4 is identified as a member of /dev/md3, slot 0.
> mdadm: /dev/sdc4 is identified as a member of /dev/md3, slot 2.
> mdadm: /dev/sde4 is identified as a member of /dev/md3, slot 4.
> mdadm: /dev/sdd4 is identified as a member of /dev/md3, slot 5.
> mdadm: no uptodate device for slot 1 of /dev/md3
> mdadm: added /dev/sdc4 to /dev/md3 as 2
> mdadm: no uptodate device for slot 3 of /dev/md3
> mdadm: added /dev/sde4 to /dev/md3 as 4
> mdadm: added /dev/sdd4 to /dev/md3 as 5
> mdadm: added /dev/sda4 to /dev/md3 as 0
> mdadm: failed to RUN_ARRAY /dev/md3: Input/output error
> mdadm: Not enough devices to start the array.

So no device claims to be member '1' or '3' of the array, and as you
cannot start an array with 2 devices missing, there is nothing that
mdadm can do. It has no way of knowing what should go in as '1' or '3'.

As you note, sda4 says that it thinks slot 1 is still active/sync, but
it doesn't seem to know which device should go there either. However,
that does indicate that slot 3 failed first and slot 1 failed later.
So if we have candidates for both, slot 1 is probably more uptodate.

You need to tell mdadm what goes where by creating the array. E.g. if
you think that sdb4 is adequately reliable and that it was in slot 1,
then

mdadm -C /dev/md3 -l5 -n5 -c 128 /dev/sda4 /dev/sdb4 /dev/sdc4 missing /dev/sde4

Alternately, if you think it best to use sdd, and it was in slot 3, then

mdadm -C /dev/md3 -l5 -n5 -c 128 /dev/sda4 missing /dev/sdc4 /dev/sdd4 /dev/sde4

would be the command to use.

Note that this command will not touch any data. It will just overwrite
the superblock and assemble the array. You can then 'fsck' or whatever
to confirm that the data looks good.

good luck.
NeilBrown
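[Editor's note: re-creating only preserves the data if the level, device count, device order, chunk size and layout passed to -C match the old array exactly, so it is worth cross-checking them against the old superblocks first. A sketch of a field extractor over saved --examine output; the `examine_field` helper and the file names are made up for illustration.]

```shell
#!/bin/sh
# examine_field: print one field's value from saved 'mdadm --examine'
# output, e.g. 'Chunk Size' or 'Layout'.  A sketch for cross-checking
# the parameters you are about to pass to 'mdadm -C'.
examine_field() {
    grep "^ *$2" "$1" | head -n 1 | sed 's/^[^:]*: *//'
}

# Example against the superblocks quoted above:
#   examine_field sda4.examine 'Chunk Size'   # 128K -> matches '-c 128'
#   examine_field sda4.examine 'Layout'       # left-symmetric (the md raid5 default)
```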
Re: array doesn't run even with --force
Neil Brown ([EMAIL PROTECTED]) wrote on 21 January 2008 14:09:
> As you note, sda4 says that it thinks slot 1 is still active/sync, but
> it doesn't seem to know which device should go there either. However,
> that does indicate that slot 3 failed first and slot 1 failed later.
> So if we have candidates for both, slot 1 is probably more uptodate.

I was going home (it's 1h20 past midnight) when I remembered and came
back to write that assembling with

/dev/sda4 /dev/sdb4 /dev/sdc4 missing /dev/sde4

works, which confirms what you say. Adding sdd4 back, it starts
resyncing; however, since sdb4 has errors, a double fault happens again
and the array fails.

> You need to tell mdadm what goes where by creating the array.
> ...
> Note that this command will not touch any data. It will just overwrite
> the superblock and assemble the array. You can then 'fsck' or whatever
> to confirm that the data looks good.

I have two possibilities: use sdd4 in slot 3, or the dump of sdb4 on
another disk in slot 1. This copy is more recent but has errors. Is it
possible to know which would be less bad before I fsck?
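[Editor's note: one way to compare the two candidates before committing, given that -C rewrites only the superblocks as noted above, is to re-create the array each way and run fsck in no-change mode. A sketch: `check_candidate` is a made-up helper; fsck -n reports problems without fixing anything, and in practice mdadm -C will ask for confirmation because the members already carry superblocks.]

```shell
#!/bin/sh
# check_candidate: re-create /dev/md3 degraded with one candidate
# member layout, run a read-only filesystem check, then stop the
# array again so the other layout can be tried.  Pass the members
# exactly as you would to 'mdadm -C', including the word 'missing'.
check_candidate() {
    mdadm -C /dev/md3 -l5 -n5 -c 128 "$@" || return 1
    fsck -n /dev/md3            # -n: report problems, change nothing
    status=$?
    mdadm -S /dev/md3           # stop the array before the next try
    return $status
}

# The two candidates discussed above:
#   check_candidate /dev/sda4 /dev/hda4 /dev/sdc4 missing /dev/sde4   # sdb4's copy in slot 1
#   check_candidate /dev/sda4 missing /dev/sdc4 /dev/sdd4 /dev/sde4   # sdd4 in slot 3
```

Whichever run reports fewer fsck complaints is probably the better starting point; then re-create that layout and run fsck for real, without -n.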