Re: question : raid bio sector size
I was referring to bios reaching make_request in raid5.c. I'll be more precise. I am dd'ing:

    dd if=/dev/md1 of=/dev/zero bs=1M count=1 skip=10

I have added the following printk in make_request:

    printk("%d:", bio->bi_size);

I am getting sizes 512:512:512:512:512. I suppose they gathered in the elevator, but still, why so small?

thank you
raz.

On 3/27/06, Neil Brown <[EMAIL PROTECTED]> wrote:
> On Monday March 27, [EMAIL PROTECTED] wrote:
> > i have been playing with raid5 and i noticed that the arriving bios are 1 sector in size.
> > why is that and where is it set?
>
> bios arriving from where?
> bios from the filesystem to the raid5 device will be whatever size the fs wants to make them.
> bios from the raid5 device to the component devices will always be 1 page (typically 8 sectors). This is the size used by the stripe cache, which is used to synchronise everything.
>
> NeilBrown

--
Raz
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: question : raid bio sector size
On Wednesday March 29, [EMAIL PROTECTED] wrote:
> I was referring to bios reaching make_request in raid5.c. I'll be more precise. I am dd'ing:
>     dd if=/dev/md1 of=/dev/zero bs=1M count=1 skip=10
> I have added the following printk in make_request:
>     printk("%d:", bio->bi_size);
> I am getting sizes 512:512:512:512:512. I suppose they gathered in the elevator, but still, why so small?

Odd.. When I try that I get 4096 repeatedly.

Which kernel are you using?
What does "blockdev --getbsz /dev/md1" say?
Do you have a filesystem mounted on /dev/md1? If so, what sort of filesystem?

NeilBrown
(X)FS corruption on 2 SATA disk RAID 1
Hello, list,

I think this is generally a hardware error, but it looks like a software problem too. At this point there is no dirty data in memory!

Cheers,
Janos

[EMAIL PROTECTED] /]# cmp -b /dev/sda1 /dev/sdb1
/dev/sda1 /dev/sdb1 differ: byte 68881481729, line 308395510 is 301 M-A 74 <

[EMAIL PROTECTED] /]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [faulty]
md10 : active raid1 sdb1[1] sda1[0]
      136729088 blocks [2/2] [UU]
      bitmap: 0/131 pages [0KB], 512KB chunk
unused devices: <none>

[EMAIL PROTECTED] /]# mount
192.168.0.1://NFS/ROOT-BASE/ on / type nfs (rw,hard,rsize=8192,wsize=8192,timeo=5,retrans=0,actimeo=1)
none on /proc type proc (rw,noexec,nosuid,nodev)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
none on /sys type sysfs (rw)
/dev/ram0 on /mnt/fast type ext2 (rw)
none on /dev/cpuset type cpuset (rw)
/dev/md10 on /mnt/1 type xfs (ro)

Cut from log:

Mar 29 08:14:45 dy-xeon-1 kernel: scsi1 : ata_piix
Mar 29 08:14:45 dy-xeon-1 kernel: Vendor: ATA  Model: WDC WD2000JD-19H  Rev: 08.0
Mar 29 08:14:45 dy-xeon-1 kernel: Type: Direct-Access  ANSI SCSI revision: 05
Mar 29 08:14:45 dy-xeon-1 kernel: Vendor: ATA  Model: WDC WD2000JD-19H  Rev: 08.0
Mar 29 08:14:45 dy-xeon-1 kernel: Type: Direct-Access  ANSI SCSI revision: 05
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sda: drive cache: write back
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sda: drive cache: write back
Mar 29 08:14:45 dy-xeon-1 kernel: sda: sda1 sda2
Mar 29 08:14:45 dy-xeon-1 kernel: sd 0:0:0:0: Attached scsi disk sda
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sdb: 390721968 512-byte hdwr sectors (200050 MB)
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sdb: drive cache: write back
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sdb: 390721968 512-byte hdwr sectors (200050 MB)
Mar 29 08:14:45 dy-xeon-1 kernel: SCSI device sdb: drive cache: write back
Mar 29 08:14:45 dy-xeon-1 kernel: sdb: sdb1 sdb2
Mar 29 08:14:45 dy-xeon-1 kernel: sd 1:0:0:0: Attached scsi disk sdb
Mar 29 08:14:45 dy-xeon-1 kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Mar 29 08:14:45 dy-xeon-1 kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0

Smart logs:

sda:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID#  ATTRIBUTE_NAME           FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1  Raw_Read_Error_Rate      0x000b  200   200   051    Pre-fail  Always   -           0
  3  Spin_Up_Time             0x0007  130   124   021    Pre-fail  Always   -           6025
  4  Start_Stop_Count         0x0032  100   100   040    Old_age   Always   -           97
  5  Reallocated_Sector_Ct    0x0033  200   200   140    Pre-fail  Always   -           0
  7  Seek_Error_Rate          0x000b  200   200   051    Pre-fail  Always   -           0
  9  Power_On_Hours           0x0032  089   089   000    Old_age   Always   -           8047
 10  Spin_Retry_Count         0x0013  100   253   051    Pre-fail  Always   -           0
 11  Calibration_Retry_Count  0x0013  100   253   051    Pre-fail  Always   -           0
 12  Power_Cycle_Count        0x0032  100   100   000    Old_age   Always   -           97
194  Temperature_Celsius      0x0022  120   111   000    Old_age   Always   -           30
196  Reallocated_Event_Count  0x0032  200   200   000    Old_age   Always   -           0
197  Current_Pending_Sector   0x0012  200   200   000    Old_age   Always   -           0
198  Offline_Uncorrectable    0x0012  200   200   000    Old_age   Always   -           0
199  UDMA_CRC_Error_Count     0x000a  200   253   000    Old_age   Always   -           0
200  Multi_Zone_Error_Rate    0x0009  200   200   051    Pre-fail  Offline  -           0

SMART Error Log Version: 1
No Errors Logged

sdb:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID#  ATTRIBUTE_NAME           FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1  Raw_Read_Error_Rate      0x000b  200   200   051    Pre-fail  Always   -           0
  3  Spin_Up_Time             0x0007  127   120   021    Pre-fail  Always   -           6175
  4  Start_Stop_Count         0x0032  100   100   040    Old_age   Always   -           94
  5  Reallocated_Sector_Ct    0x0033  200   200   140    Pre-fail  Always   -           0
  7  Seek_Error_Rate          0x000b  200   200   051    Pre-fail  Always   -           0
  9  Power_On_Hours           0x0032  089   089   000    Old_age   Always   -           8065
 10  Spin_Retry_Count         0x0013  100   253   051    Pre-fail  Always   -           0
 11  Calibration_Retry_Count  0x0013  100   253   051    Pre-fail  Always   -           0
 12  Power_Cycle_Count        0x0032  100   100   000    Old_age   Always   -           94
194  Temperature_Celsius      0x0022  117   109   000    Old_age   Always   -           33
Re: question : raid bio sector size
man .. very very good.
blockdev --getsz says 512.

On 3/29/06, Neil Brown <[EMAIL PROTECTED]> wrote:
> On Wednesday March 29, [EMAIL PROTECTED] wrote:
> > I was referring to bios reaching make_request in raid5.c. I'll be more precise. I am dd'ing:
> >     dd if=/dev/md1 of=/dev/zero bs=1M count=1 skip=10
> > I have added the following printk in make_request:
> >     printk("%d:", bio->bi_size);
> > I am getting sizes 512:512:512:512:512. I suppose they gathered in the elevator, but still, why so small?
>
> Odd.. When I try that I get 4096 repeatedly.
>
> Which kernel are you using?
> What does "blockdev --getbsz /dev/md1" say?
> Do you have a filesystem mounted on /dev/md1? If so, what sort of filesystem?
>
> NeilBrown

--
Raz
Re: making raid5 more robust after a crash?
On Sat, Mar 18, 2006 at 08:13:48AM +1100, Neil Brown wrote:
> On Friday March 17, [EMAIL PROTECTED] wrote:
> > Dear All,
> >
> > We have a number of machines running 4TB raid5 arrays. Occasionally one of these machines will lock up solid and will need power cycling. Often when this happens, the array will refuse to restart with 'cannot start dirty degraded array'. Usually mdadm --assemble --force will get the thing going again - although it will then do a complete resync.
> >
> > My question is: Is there any way I can make the array more robust? I don't mind it losing a single drive and having to resync when we get a lockup - but having to do a forced assemble always makes me nervous, and means that this sort of crash has to be escalated to a senior engineer.
>
> Why is the array degraded? Having a crash while the array is degraded can cause undetectable data loss. That is why md won't assemble the array itself: you need to know there could be a problem. But a crash with a degraded array should be fairly unusual. If it is happening a lot, then there must be something wrong with your config: either you are running degraded a lot (which is not safe, don't do it), or md cannot find all the devices to assemble.

Thanks for your reply. As you guessed, this was a problem with our hardware/config and nothing to do with the raid software.

After much investigation we found that we had two separate problems. The first was a SATA driver problem: it would occasionally return hard errors for a drive in the array, after which that drive would get kicked. The second was XFS over NFS using up too much kernel stack and hanging the machine. If both happened before we noticed (say during the night), the result would be one drive dirty because of the SATA driver and one dirty because of the lockup.

The real sting in the tail is that (for some reason) the drive lost through the SATA problem would not be marked as dirty - so if the array was force rebuilt, it would be used in place of the more recent failure - causing horrible synchronisation problems.

Can anybody point me to the syntax I could use for saying: "force rebuild the array using drives A, B, C and D but not E, even though E looks fresh and D doesn't"?

Typical syslog:

Mar 17 10:45:24 snap27 kernel: md: Autodetecting RAID arrays.
Mar 17 10:45:24 snap27 kernel: raid5: cannot start dirty degraded array for md0

> So where is 'disk 1'?? Presumably it should be 'sdb1'. Does that drive exist? Is it marked for auto-detect like the others?

Ok, this syslog was a complete red herring for the above problem - and you hit the nail right on the head - in this particular case I had installed a new sdb1 and forgot to set the autodetect flag :-)

Chris.
Re: making raid5 more robust after a crash?
On Wednesday March 29, [EMAIL PROTECTED] wrote:
> Thanks for your reply. As you guessed, this was a problem with our hardware/config and nothing to do with the raid software.

I'm glad you have found your problem!

> Can anybody point me to the syntax I could use for saying: "force rebuild the array using drives A, B, C and D but not E, even though E looks fresh and D doesn't"?

    mdadm -Af /dev/mdX A B C D

i.e. don't even tell mdadm about E.

NeilBrown
addendum: was Re: recovering data on a failed raid-0 installation
ok, guy and others.

this is a followup to the case I am currently trying (still) to solve.

synopsis: the general consensus is that raid0 writes in a striping fashion.

However, the test case I have here doesn't appear to operate in the above described manner. what was observed was this: on /dev/md0 (while observing drive activity for both hda and hdb), hda was active until filled, at which point data was spanned to hdb. In other words, the data was written in a linear, not striped, manner.

given this behavior (as observed), it stands to reason that the data on the first of the 2 members of this raid should be recoverable, if only we could trick the raid into allowing us to mount it without its second member. at this point, we are assuming that the data on drive 2 (hdb) is not recoverable.

In a scientific fashion, assuming that the observed behavior is correct, how would one go about recovering data from the first member without the second being present? I assume that we are going to have to use mdadm in such a way as to trick it into thinking it is doing something that it is not. I invite anyone here to set up a similar testing environment to confirm these results.

drives: 2 identical IDE drives (same make/model)
suse 9.3 os.

p.s. I have heard all the naysayer commentary so please, keep it to USEFUL information only. thanks

On Tuesday 28 March 2006 22:26, you wrote:
> RAID0 uses all disks evenly (all 2 in your case). I don't see how you can recover from a drive failure with a RAID0. Never use RAID0 unless you are willing to lose all the data!
>
> Are you sure the second disk is dead? Have you done a read test on the disk? dd works well for read testing. Try this:
>     dd if=/dev/hdb2 of=/dev/null bs=64k
> or
>     dd if=/dev/hdb of=/dev/null bs=64k
>
> Guy
>
> } -----Original Message-----
> } From: [EMAIL PROTECTED] [mailto:linux-raid-
> } [EMAIL PROTECTED] On Behalf Of Technomage
> } Sent: Wednesday, March 29, 2006 12:09 AM
> } To: linux-raid@vger.kernel.org
> } Subject: recovering data on a failed raid-0 installation
> }
> } ok,
> } here's the situation in a nutshell.
> }
> } one of the 2 HD's in a linux raid-0 installation has failed.
> }
> } Fortunately, or otherwise, it was NOT the primary HD.
> }
> } problem is, I need to recover data from the first drive but appear to be
> } unable to do so because the raid is not complete. the second drive only had
> } 193 MB written to it and I am fairly certain that the data I would like to
> } recover is NOT on that drive.
> }
> } can anyone offer any solutions to this?
> }
> } the second HD is not usable (heat related failure issues).
> }
> } The filesystem used on the md0 partition (under mdadm) was xfs. now I have
> } tried the xfs_check and xfs_repair tools and they are not helpful at this
> } point.
> }
> } The developer (of mdadm) suggested I use the following commands in an attempt
> } to recover:
> }
> } mdadm -C /dev/md0 -l0 -n2 /dev/..
> } fsck -n /dev/md0
> }
> } However, the second one was a no go.
> }
> } I am stumped as to how to proceed here. I need the data off the first drive,
> } but do not appear to have any way (other than using cat to see it) to get at
> } it.
> }
> } some help would be greatly appreciated.
> }
> } technomage
> }
> } p. here is the original response sent back to me from the developer of mdadm:
> } ***
> } Re: should have been more explicit here - Re: need some help URGENT!
> } From: Neil Brown [EMAIL PROTECTED]
> } To: Technomage [EMAIL PROTECTED]
> } Date: Sunday 22:01:45
> } On Sunday March 26, [EMAIL PROTECTED] wrote:
> } ok,
> }
> } you gave me more info than some local to that mentioned e-mail list.
> }
> } ok, the vast majority of the data I need to recover is on /dev/hda
> } and /dev/hdb only has 193 MB and is probably irrelevant.
> }
> } can you help me with this?
> } can you baby me through this. I really need to recover this data (if at all
> } possible).
> }
> } Not really, and certainly not now (I have to go out).
> } I have already made 2 suggestions
> } mail linux-raid@vger.kernel.org
> } and
> } mdadm -C /dev/md0 -l0 -n2 /dev/..
> } fsck -n /dev/md0
> }
> } try one of those.
> }
> } NeilBrown
> }
> }
> } the friend of mine that this actually happened to is on the phone, begging me
> } and grovelling before the gods of linux in order to have this fixed. I have
> } setup an identical test situation here.
> }
> } the important data is on drive 1 and drive 2 is mostly irrelevant.
> } is there any way to convince raid-0 to truncate to the end of drive 1 and
> } allow me to get whatever data I can off. btw, the filesystem that was
> } formatted was xfs (for linux) on md0.
> }
> } if you have questions, please do not hesitate to ask.
> }
> } thank you.
> }
> } p. real name here is Eric.
> }
> }
> } On Sunday 26 March 2006 21:33, you wrote:
> } On Sunday March 26, [EMAIL PROTECTED] wrote:
> } With a name like Technomage and a vague subject need some help URGENT, I very really
ANNOUNCE: mdadm 2.4 - A tool for managing Soft RAID under Linux
I am pleased to announce the availability of mdadm version 2.4.

It is available at the usual places:
   http://www.cse.unsw.edu.au/~neilb/source/mdadm/
and
   http://www.{countrycode}.kernel.org/pub/linux/utils/raid/mdadm/

mdadm is a tool for creating, managing and monitoring device arrays using the "md" driver in Linux, also known as Software RAID arrays.

Release 2.4 primarily adds support for increasing the number of devices in a RAID5 array, which requires 2.6.17 (or some -rc or -mm prerelease). It also includes a number of minor functionality enhancements and documentation updates.

Changelog Entries:
    - Rewrite 'reshape' support including performing a backup of the critical region for a raid5 growth, and restoring that backup after a crash.
    - Put a 'canary' at each end of the backup so a corruption can be more easily detected.
    - Remove useless 'ident' argument from ->getinfo_super method.
    - Support --backup-file for backing up the critical section during growth.
    - Erase old superblocks (of different versions) when creating a new array.
    - Allow --monitor to work with arrays with >28 devices
    - Report reshape information in --detail
    - Handle symlinks in /dev better
    - Fix mess in --detail output when a device is missing.
    - Manpage tidyup
    - Support 'bitmap=' in mdadm.conf for auto-assembling arrays with write-intent bitmaps in separate files.
    - Updates to md.4 man page including section on RESTRIPING and SYSFS

Development of mdadm is sponsored by SUSE Labs, Novell Inc.

NeilBrown  30th March 2006
RE: addendum: was Re: recovering data on a failed raid-0 installation
If what you say is true, then it was not a RAID0. It sounds like LINEAR. Do you have the original command used to create the array? Or the output from mdadm before you tried any recovery methods? The output must be from before you re-created the array. Output from commands like "mdadm -D /dev/md0" or "mdadm -E /dev/hda2". Or the output from "cat /proc/mdstat", from before you re-created the array.

Guy

} -----Original Message-----
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Technomage
} Sent: Wednesday, March 29, 2006 11:15 PM
} To: Guy
} Cc: linux-raid@vger.kernel.org
} Subject: addendum: was Re: recovering data on a failed raid-0 installation
}
} ok, guy and others.
}
} this is a followup to the case I am currently trying (still) to solve.
}
} synopsis:
} the general consensus is that raid0 writes in a striping fashion.
}
} However, the test case I have here doesn't appear to operate in the above
} described manner. what was observed was this: on /dev/md0 (while observing
} drive activity for both hda and hdb) hda was active until filled at which
} point data was spanned to hdb. In other words, the data was written in a
} linear, not striped, manner.
}
} given this behavior (as observed), it stands to reason that the data on the
} first of the 2 members of this raid should be recoverable, if only we could
} trick the raid into allowing us to mount it without its second member. at
} this point, we are assuming that the data on drive 2 (hdb) is not
} recoverable.
}
} In a scientific fashion, assuming that the observed behavior is correct, how
} would one go about recovering data from the first member without the second
} being present? I assume that we are going to have to use mdadm in such a way
} as to trick it into thinking it is doing something that it is not. I invite
} anyone here to setup a similar testing environment to confirm these results.
}
} drives: 2 identical IDE drives (same make/model)
} suse 9.3 os.
}
} p.s. I have heard all the naysayer commentary so please, keep it to USEFUL
} information only. thanks
}
} On Tuesday 28 March 2006 22:26, you wrote:
} RAID0 uses all disks evenly (all 2 in your case). I don't see how you can
} recover from a drive failure with a RAID0. Never use RAID0 unless you are
} willing to lose all the data!
}
} Are you sure the second disk is dead? Have you done a read test on the
} disk? dd works well for read testing. Try this:
} dd if=/dev/hdb2 of=/dev/null bs=64k
} or
} dd if=/dev/hdb of=/dev/null bs=64k
}
} Guy
}
} } -----Original Message-----
} } From: [EMAIL PROTECTED] [mailto:linux-raid-
} } [EMAIL PROTECTED] On Behalf Of Technomage
} } Sent: Wednesday, March 29, 2006 12:09 AM
} } To: linux-raid@vger.kernel.org
} } Subject: recovering data on a failed raid-0 installation
} }
} } ok,
} } here's the situation in a nutshell.
} }
} } one of the 2 HD's in a linux raid-0 installation has failed.
} }
} } Fortunately, or otherwise, it was NOT the primary HD.
} }
} } problem is, I need to recover data from the first drive but appear to be
} } unable to do so because the raid is not complete. the second drive only had
} } 193 MB written to it and I am fairly certain that the data I would like to
} } recover is NOT on that drive.
} }
} } can anyone offer any solutions to this?
} }
} } the second HD is not usable (heat related failure issues).
} }
} } The filesystem used on the md0 partition (under mdadm) was xfs. now I have
} } tried the xfs_check and xfs_repair tools and they are not helpful at this
} } point.
} }
} } The developer (of mdadm) suggested I use the following commands in an
} } attempt to recover:
} }
} } mdadm -C /dev/md0 -l0 -n2 /dev/..
} } fsck -n /dev/md0
} }
} } However, the second one was a no go.
} }
} } I am stumped as to how to proceed here. I need the data off the first
} } drive, but do not appear to have any way (other than using cat to see it)
} } to get at it.
} }
} } some help would be greatly appreciated.
} }
} } technomage
} }
} } p. here is the original response sent back to me from the developer of
} } mdadm:
} } ***
} } Re: should have been more explicit here - Re: need some help URGENT!
} } From: Neil Brown [EMAIL PROTECTED]
} } To: Technomage [EMAIL PROTECTED]
} } Date: Sunday 22:01:45
} } On Sunday March 26, [EMAIL PROTECTED] wrote:
} } ok,
} }
} } you gave me more info than some local to that mentioned e-mail list.
} }
} } ok, the vast majority of the data I need to recover is on /dev/hda
} } and /dev/hdb only has 193 MB and is probably irrelevant.
} }
} } can you help me with this?
} } can you baby me through this. I really need to recover this data (if at
} } all possible).
} }
} } Not really, and certainly not now (I have to go out).
} } I have already made 2 suggestions
} } mail
Re: [PATCH] Add stripe cache entries to raid6 sysfs
On Saturday March 25, [EMAIL PROTECTED] wrote:
> Raid-6 did not create sysfs entries for stripe cache
>
> Signed-off-by: Brad Campbell [EMAIL PROTECTED]
> ---
> diff -u vanilla/linux-2.6.16/drivers/md/raid6main.c linux-2.6.16/drivers/md/raid6main.c
> --- vanilla/linux-2.6.16/drivers/md/raid6main.c	2006-03-20 09:53:29.0 +0400
> +++ linux-2.6.16/drivers/md/raid6main.c	2006-03-25 16:35:05.0 +0400
> @@ -2148,6 +2148,7 @@
>  	}
>
>  	/* Ok, everything is just fine now */
> +	sysfs_create_group(&mddev->kobj, &raid6_attrs_group);
>  	mddev->array_size = mddev->size * (mddev->raid_disks - 2);
>  	mddev->queue->unplug_fn = raid6_unplug_device;

Gee, I wonder how I missed that... Thanks!

NeilBrown
[PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
Currently a device failure during recovery leaves bits set in the bitmap. This normally isn't a problem, as the offending device will be rejected because of errors. However if device re-adding is being used with non-persistent bitmaps, this can be a problem.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid1.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2006-03-30 16:48:29.0 +1100
+++ ./drivers/md/raid1.c	2006-03-30 16:48:40.0 +1100
@@ -1135,8 +1135,19 @@ static int end_sync_write(struct bio *bi
 			mirror = i;
 			break;
 		}
-	if (!uptodate)
+	if (!uptodate) {
+		int sync_blocks = 0;
+		sector_t s = r1_bio->sector;
+		long sectors_to_go = r1_bio->sectors;
+		/* make sure these bits don't get cleared. */
+		do {
+			bitmap_end_sync(mddev->bitmap, r1_bio->sector,
+					&sync_blocks, 1);
+			s += sync_blocks;
+			sectors_to_go -= sync_blocks;
+		} while (sectors_to_go > 0);
 		md_error(mddev, conf->mirrors[mirror].rdev);
+	}

 	update_head_pos(mirror, r1_bio);
[PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock.
And remove the comments that were put in place of a fix, too.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2006-03-30 16:48:30.0 +1100
+++ ./drivers/md/md.c	2006-03-30 16:48:47.0 +1100
@@ -214,13 +214,11 @@ static void mddev_put(mddev_t *mddev)
 		return;
 	if (!mddev->raid_disks && list_empty(&mddev->disks)) {
 		list_del(&mddev->all_mddevs);
-		/* that blocks */
+		spin_unlock(&all_mddevs_lock);
 		blk_cleanup_queue(mddev->queue);
-		/* that also blocks */
 		kobject_unregister(&mddev->kobj);
-		/* result blows... */
-	}
-	spin_unlock(&all_mddevs_lock);
+	} else
+		spin_unlock(&all_mddevs_lock);
 }

 static mddev_t * mddev_find(dev_t unit)
[PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache
Signed-off-by: Brad Campbell [EMAIL PROTECTED]
Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid6main.c |    2 ++
 1 file changed, 2 insertions(+)

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2006-03-30 16:48:30.0 +1100
+++ ./drivers/md/raid6main.c	2006-03-30 16:48:52.0 +1100
@@ -2151,6 +2151,8 @@ static int run(mddev_t *mddev)
 	}

 	/* Ok, everything is just fine now */
+	sysfs_create_group(&mddev->kobj, &raid6_attrs_group);
+
 	mddev->array_size = mddev->size * (mddev->raid_disks - 2);
 	mddev->queue->unplug_fn = raid6_unplug_device;
[PATCH 000 of 3] md: Introduction - assorted fixed for 2.6.16
Following are three patches for md.

The first fixes a problem that can cause corruption in fairly unusual circumstances (re-adding a device to a raid1 and suffering write errors that are subsequently fixed, and then the device is re-added again). The other two fix minor problems.

They are suitable to go straight in to 2.6.17-rc.

NeilBrown

 [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
 [PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock.
 [PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache
Re: [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
NeilBrown <[EMAIL PROTECTED]> wrote:
> +	if (!uptodate) {
> +		int sync_blocks = 0;
> +		sector_t s = r1_bio->sector;
> +		long sectors_to_go = r1_bio->sectors;
> +		/* make sure these bits don't get cleared. */
> +		do {
> +			bitmap_end_sync(mddev->bitmap, r1_bio->sector,
> +					&sync_blocks, 1);
> +			s += sync_blocks;
> +			sectors_to_go -= sync_blocks;
> +		} while (sectors_to_go > 0);
>  		md_error(mddev, conf->mirrors[mirror].rdev);
> +	}

Can mddev->bitmap be NULL? If so, will the above loop do the right thing when this:

	void bitmap_end_sync(struct bitmap *bitmap, sector_t offset, int *blocks, int aborted)
	{
		bitmap_counter_t *bmc;
		unsigned long flags;
	/*
		if (offset == 0) printk("bitmap_end_sync 0 (%d)\n", aborted);
	*/
		if (bitmap == NULL) {
			*blocks = 1024;
			return;
		}

triggers?
Re: [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
On Wednesday March 29, [EMAIL PROTECTED] wrote:
> NeilBrown <[EMAIL PROTECTED]> wrote:
> > +	if (!uptodate) {
> > +		int sync_blocks = 0;
> > +		sector_t s = r1_bio->sector;
> > +		long sectors_to_go = r1_bio->sectors;
> > +		/* make sure these bits don't get cleared. */
> > +		do {
> > +			bitmap_end_sync(mddev->bitmap, r1_bio->sector,
> > +					&sync_blocks, 1);
> > +			s += sync_blocks;
> > +			sectors_to_go -= sync_blocks;
> > +		} while (sectors_to_go > 0);
> >  		md_error(mddev, conf->mirrors[mirror].rdev);
> > +	}
>
> Can mddev->bitmap be NULL?

Yes, normally it is.

> If so, will the above loop do the right thing when this:
>
> 	void bitmap_end_sync(struct bitmap *bitmap, sector_t offset, int *blocks, int aborted)
> 	{
> 		bitmap_counter_t *bmc;
> 		unsigned long flags;
> 		if (bitmap == NULL) {
> 			*blocks = 1024;
> 			return;
> 		}
>
> triggers?

Yes. sync_blocks will be 1024 (a nice big number) and the loop will exit quite quickly, having done nothing (which is what it needs to do in that case).

Of course, if someone submits a bio for multiple thousands of sectors it will loop needlessly a few times, but do we ever generate bios that are even close to a megabyte? If so, that 1024 can be safely increased to 1<<20, and possibly higher, but I would need to check.

Thanks for asking.

NeilBrown
making raid5 more robust against block errors
Is there any work going on to handle read errors on a raid5 disk by recreating the faulty block from the other disks and just rewriting the block, instead of kicking the disk out?

I've had problems on several occasions where two disks in a raid5 have single-sector errors, and thus it's impossible (afaik) to get the array up and running without a lot of manual intervention and likely data loss, even though the information needed to get the array up and running without data loss is actually there.

I know this has been discussed before (I've been in these discussions myself); I just wanted to know if this resulted in any improvement?

Right now I am more prone to using 3ware hw-raid rather than sw-raid due to this, as it will do the above and handle the read error gracefully. Data integrity is more important than write speed (where sw-raid excels) for me.

--
Mikael Abrahamsson    email: [EMAIL PROTECTED]