Re: Resize on dirty array?
On Sun, 13 Aug 2006, dean gaudet wrote:
> On Fri, 11 Aug 2006, David Rees wrote:
> > On 8/11/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> > > On Fri, 11 Aug 2006, David Rees wrote:
> > > > On 8/10/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> > > > > - set up smartd to run long self tests once a month. (stagger it
> > > > > every few days so that your disks aren't doing self-tests at the
> > > > > same time)
> > > >
> > > > I personally prefer to do a long self-test once a week, a month seems
> > > > like a lot of time for something to go wrong.
> > >
> > > unfortunately i found some drives (seagate 400 pata) had a rather
> > > negative effect on performance while doing self-test.
> >
> > Interesting that you noted negative performance, but I typically
> > schedule the tests for off-hours anyway where performance isn't
> > critical.
> >
> > How much of a performance hit did you notice?
>
> i never benchmarked it explicitly.  iirc the problem was generally
> metadata performance... and became less of an issue when i moved the
> filesystem log off the raid5 onto a raid1.  unfortunately there aren't
> really any "off hours" for this system.

the problem reappeared... so i can provide some data.

one of the 400GB seagates has been stuck at 20% of a SMART long self test
for over 2 days now, and the self-test itself has been going for about 4.5
days total.

a typical "iostat -x /dev/sd[cdfgh] 30" sample looks like this:

Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sdc       90.94  137.52  14.70  25.76  841.32 1360.35    54.43     0.94  23.30  10.30  41.68
sdd       93.67  140.52  14.96  22.06  863.98 1354.75    59.93     0.91  24.50  12.17  45.05
sdf       92.84  136.85  15.36  26.39  857.85 1360.35    53.13     0.88  21.04  10.59  44.21
sdg       87.74  137.82  14.23  24.86  807.73 1355.55    55.35     0.85  21.86  11.25  43.99
sdh       87.20  134.56  14.96  28.29  810.13 1356.88    50.10     1.90  43.72  20.02  86.60

those 5 are in a raid5, so their io should be relatively even... notice
the await, svctm and %util of sdh compared to the other 4.  sdh is the one
with the exceptionally slow-going SMART long self-test.  i assume it's
still making progress because the effect is measurable in iostat.

-dean

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
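[Editor's note: dean spots the failing member by eyeballing the await column across the raid set. That comparison is easy to automate with a small awk filter; the sketch below is illustrative only -- the 40 ms threshold is an arbitrary assumption, and in practice you would pipe in live output from `iostat -x /dev/sd[cdfgh] 30` rather than the captured sample shown in the here-document.]

```shell
# Flag raid members whose average wait time looks out of line.
# Field 10 of this iostat -x format is "await" (ms); 40 ms is an
# example threshold, not a recommendation.
awk '$1 ~ /^sd/ && $10 + 0 > 40 {
    print $1 " await=" $10 "ms (high; check SMART self-test status)"
}' <<'EOF'
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdc 90.94 137.52 14.70 25.76 841.32 1360.35 54.43 0.94 23.30 10.30 41.68
sdh 87.20 134.56 14.96 28.29 810.13 1356.88 50.10 1.90 43.72 20.02 86.60
EOF
```

Run against the sample above, only sdh trips the filter, matching dean's observation.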
Re: Resize on dirty array?
Neil Brown <[EMAIL PROTECTED]> wrote:
> I would be a lot happier about it if the block layer told me whether
> the fail was a Media error or some other sort of error.

This wouldn't help you either.  I've seen drives (mainly Samsung) that
locked up the whole IDE bus after a few simple (consecutive) sector read
errors.  And with strange IDE drivers (like PDC202XX_NEW) this could even
escalate to whole-machine freezes - I've seen that too, and had to play
around with device-mapper's dm-error target to work around it :)

So IMHO at least the default behaviour should stay as it currently is:
if a drive fails, never touch it again.  Perhaps you could make such a
"FAILING" feature configurable (perhaps even on a per-mirror basis), so
that users can enable it for drives they *do* know don't show such bad
behaviour.

regards
Mario
--
Independence Day: Fortunately, the alien computer operating system works
just fine with the laptop.  This proves an important point which Apple
enthusiasts have known for years.  While the evil empire of Microsoft may
dominate the computers of Earth people, more advanced life forms clearly
prefer Macs.
Re: Resize on dirty array?
On Saturday August 12, [EMAIL PROTECTED] wrote:
> On 8/9/06, James Peverill <[EMAIL PROTECTED]> wrote:
> >
> > I'll try the force assemble but it sounds like I'm screwed.  It sounds
> > like what happened was that two of my drives developed bad sectors in
> > different places that weren't found until I accessed certain areas (in
> > the case of the first failure) and did the drive rebuild (for the second
> > failure).  In the future, is there a way to help prevent this?
>
> This is a common scenario, and I feel could be helped if md could be
> told to not drop the disk on first failure, but rather keep it running
> in "FAILING" status (as opposed to FAILED), until all data from it has
> been evacuated (hot spare).  This way, if another disk became "failing"
> during rebuild, due to another area of the disk, those blocks could be
> rebuilt using the other "failing" disk.  (Also, this allows for the
> rebuild to mostly be a ddrescue-style copy operation, rather than
> parity computation).
>
> Do you guys feel this is feasible?  Neil?

Maybe.  I would be a lot happier about it if the block layer told me
whether the fail was a Media error or some other sort of error.

But something could probably be arranged, and the general idea has been
suggested a number of times now, so maybe it really is a good idea :-)
I'll put it on my todo list :-)

NeilBrown
Re: Resize on dirty array?
On Fri, 11 Aug 2006, David Rees wrote:
> On 8/11/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> > On Fri, 11 Aug 2006, David Rees wrote:
> > > On 8/10/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> > > > - set up smartd to run long self tests once a month. (stagger it every
> > > > few days so that your disks aren't doing self-tests at the same time)
> > >
> > > I personally prefer to do a long self-test once a week, a month seems
> > > like a lot of time for something to go wrong.
> >
> > unfortunately i found some drives (seagate 400 pata) had a rather negative
> > effect on performance while doing self-test.
>
> Interesting that you noted negative performance, but I typically
> schedule the tests for off-hours anyway where performance isn't
> critical.
>
> How much of a performance hit did you notice?

i never benchmarked it explicitly.  iirc the problem was generally
metadata performance... and became less of an issue when i moved the
filesystem log off the raid5 onto a raid1.  unfortunately there aren't
really any "off hours" for this system.

-dean
Re: Resize on dirty array?
On 8/9/06, James Peverill <[EMAIL PROTECTED]> wrote:
> I'll try the force assemble but it sounds like I'm screwed.  It sounds
> like what happened was that two of my drives developed bad sectors in
> different places that weren't found until I accessed certain areas (in
> the case of the first failure) and did the drive rebuild (for the second
> failure).  In the future, is there a way to help prevent this?

This is a common scenario, and I feel could be helped if md could be
told to not drop the disk on first failure, but rather keep it running
in "FAILING" status (as opposed to FAILED), until all data from it has
been evacuated (hot spare).  This way, if another disk became "failing"
during rebuild, due to another area of the disk, those blocks could be
rebuilt using the other "failing" disk.  (Also, this allows for the
rebuild to mostly be a ddrescue-style copy operation, rather than
parity computation).

Do you guys feel this is feasible?  Neil?
Re: Resize on dirty array?
David Rees wrote:
> > I personally prefer to do a long self-test once a week, a month seems
> > like a lot of time for something to go wrong.
>
> > unfortunately i found some drives (seagate 400 pata) had a rather
> > negative effect on performance while doing self-test.
>
> Interesting that you noted negative performance, but I typically
> schedule the tests for off-hours anyway where performance isn't
> critical.

Personally I have every disk do a short test at 6am Monday-Saturday, and
then they *all* (29 of them) do a long test every Sunday at 6am.  I figure
having all disks do a long test at the same time rather than staggered is
going to show up any pending issues with my PSUs as well.

(Been doing this for nearly 2 years now and it has shown up a couple of
drives that were slowly growing defects.  Nothing a
dd if=/dev/zero of=/dev/sd(x) didn't fix, though.)

Brad
--
"Human beings, who are almost unique in having the ability to learn from
the experience of others, are also remarkable for their apparent
disinclination to do so." -- Douglas Adams
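[Editor's note: both schedules described in this subthread map onto smartd's `-s` directive, which takes a regex of the form T/MM/DD/d/HH (test type, month, day-of-month, day-of-week 1=Mon..7=Sun, hour). A minimal smartd.conf sketch; the device names and mail address are examples, not taken from the thread.]

```shell
# /etc/smartd.conf sketch -- Brad's schedule: short test Mon-Sat at 06:00,
# long test every Sunday at 06:00, with mail alerts (-m) on problems.
/dev/sda -a -m root -s (S/../../[1-6]/06|L/../../7/06)
/dev/sdb -a -m root -s (S/../../[1-6]/06|L/../../7/06)

# dean's staggered alternative: long tests on different days per drive,
# so the disks never self-test at the same time.
# /dev/sda -a -m root -s L/../../1/03
# /dev/sdb -a -m root -s L/../../3/03
```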
Re: Resize on dirty array?
On 8/11/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> On Fri, 11 Aug 2006, David Rees wrote:
> > On 8/10/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> > > - set up smartd to run long self tests once a month. (stagger it every
> > > few days so that your disks aren't doing self-tests at the same time)
> >
> > I personally prefer to do a long self-test once a week, a month seems
> > like a lot of time for something to go wrong.
>
> unfortunately i found some drives (seagate 400 pata) had a rather negative
> effect on performance while doing self-test.

Interesting that you noted negative performance, but I typically
schedule the tests for off-hours anyway where performance isn't
critical.

How much of a performance hit did you notice?

-Dave
Re: Resize on dirty array?
On 8/10/06, dean gaudet <[EMAIL PROTECTED]> wrote:
> - set up smartd to run long self tests once a month. (stagger it every
> few days so that your disks aren't doing self-tests at the same time)

I personally prefer to do a long self-test once a week, a month seems
like a lot of time for something to go wrong.

> - run nightly diffs of smartctl -a output on all your drives so you see
> when one of them reports problems in the smart self test or otherwise
> has a Current_Pending_Sectors or Realloc event... then launch a repair
> sync_action.

You can (and probably should) set up smartd to automatically send out
email alerts as well.

-Dave
Re: Resize on dirty array?
> "Mark" == Mark Hahn <[EMAIL PROTECTED]> writes: >> RAID is no excuse for backups. Mark> I wish people would quit saying this: not only is it not helpful, Mark> but it's also wrong. You've got to be kidding, right? A backup is another aspect of data protection. RAID is another form. Both have their uses, and both should be used on any system with important data. You're just spouting the wrong thing here and I really dislike seeing it, which has prompted this reply. Mark> a traditional backup is nothing more than a strangely async Mark> raid1, with the same space inefficiency. tape is not the Mark> answer, and getting more not. the idea of a periodic snapshot Mark> to media which is located apart and not under the same load as Mark> the primary copy is a good one, but not cheap or easy. backups Mark> are also often file-based, which is handy but orthogonal to Mark> being raid (or incremental, for that matter). and backups don't Mark> mean you can avoid the cold calculation of how much reliability Mark> you want to buy. _that_ is how you should choose your storage Mark> architecture... You again mixing up your ideas here. This is the first time I've ever heard someone imply that backups to tape are a form of RAID, never. You really have an interesting point of view here. Now maybe you do have some good points, but they're certainly not articulated clearly. Just to work through them: First, backups to tape may not be cheap or easy, especially with the rise of 250gb disks for $100. Buying a tape drive that has the space and performance to backup that amount of data can be a big investment. Second, reliability is a different measure from that of data retention. I can have the most reliable RAID system on a server which can handle multiple devices failing (because they weren't reliable), or power supply failure or connectivity failures, etc. But if a user deletes a file and it can't be recovered from your RAID system, then how much help has that RAID system been? 
Now you may argue that reliability includes backups, but that's just wrong. Reliability is a measure of the media/sub-system. It's not a measure of how good your backups are. So you then claim that snapshots are a great way to get cheap and easy backups, especially when you have reliable RAID. So what happens when your building burns down? Or even just your house? (As an aside, while I do backups at home, I don't take them offsite in case of fire. Shame on me, and I'm a SysAdmin by profession!) So how do you know that your snapshots are reliable? Are they filesystem based? Are they volume based? If volume based, how do you get the filesystem in a quiescent state to make sure there's no corruption when you make the snapshot? It's not a trivial problem. And even traditional backups to tape have this issue. I'd write more, but I'm busy with other stuff and I wanted to hear your justifications in more detail before I bothered spending the time to refute them. John - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resize on dirty array?
suggestions:

- set up smartd to run long self tests once a month.  (stagger it every
  few days so that your disks aren't doing self-tests at the same time)

- run 2.6.15 or later so md supports repairing read errors from the
  other drives...

- run 2.6.16 or later so you get the check and repair sync_actions in
  /sys/block/mdX/md/sync_action (i think 2.6.16.x still has a bug where
  you have to echo a random word other than repair to sync_action to get
  a repair to start... wrong sense on a strcmp, fixed in 2.6.17).

- run nightly diffs of smartctl -a output on all your drives so you see
  when one of them reports problems in the smart self test or otherwise
  has a Current_Pending_Sectors or Realloc event... then launch a repair
  sync_action.

- proactively replace your disks every couple of years (i prefer to
  replace busy disks before 3 years).

-dean

On Wed, 9 Aug 2006, James Peverill wrote:
> In this case the raid WAS the backup... however it seems it turned out
> to be less reliable than the single disks it was supporting.  In the
> future I think I'll make sure my disks have varying ages so they don't
> fail all at once.
>
> James
>
> > RAID is no excuse for backups.
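[Editor's note: the sysfs scrub interface in dean's third bullet is driven by writing action names into sync_action. A minimal sketch, assuming an array named md0 and a 2.6.16+ kernel; on 2.6.16.x specifically, per dean's caveat above, the literal word "repair" may not trigger due to the inverted strcmp, fixed in 2.6.17.]

```shell
# read-only scrub: read every stripe and count parity mismatches
echo check > /sys/block/md0/md/sync_action

# watch progress
cat /proc/mdstat

# rewrite mismatched or unreadable blocks from the redundant copies
echo repair > /sys/block/md0/md/sync_action
```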
Re: Resize on dirty array?
James Peverill wrote:
> In this case the raid WAS the backup... however it seems it turned out
> to be less reliable than the single disks it was supporting.  In the
> future I think I'll make sure my disks have varying ages so they don't
> fail all at once.

No, it wasn't *less* reliable than a single drive; you benefited as soon
as a drive failed.  At that point you would have been just as toasted as
you may well be at the moment.  With RAID you then stressed the remaining
drives to the point of a second failure (not that you had much choice -
you *could* have spent money on enough media to mirror your data whilst
you played with your only remaining copy - that's a cost/risk tradeoff
you chose not to make.  I've made the same choice in the past - I've been
lucky - you were not - sorry.)

I can't see where you mention the kernel version you're running?  md can
perform validation syncs on a periodic basis in later kernels - Debian's
mdadm enables this in cron.

David

PS Reorganise lines from distributed reply as you like :)
Re: Resize on dirty array?
James Peverill wrote:
> I'll try the force assemble but it sounds like I'm screwed.  It
> sounds like what happened was that two of my drives developed bad
> sectors in different places that weren't found until I accessed
> certain areas (in the case of the first failure) and did the drive
> rebuild (for the second failure).

The file /sys/block/mdX/md/sync_action can be used to issue a recheck of
the data.  Read Documentation/md.txt in the kernel source for details
about the exact procedure.

My advice (if you still want to continue using software raid) is that
you run such a check before any add/grow or other action in the future.
Also, if the raid has been unused for a long while it might be a good
idea to recheck the data.

[snip]

I feel your pain.  Massive data loss is the worst.  I have had my share
of crashes - once due to a bad disk and no redundancy, the other time
due to good old stupidity.

Henrik Holst
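[Editor's note: Henrik's "run a check before any add/grow" advice can be wrapped in a small pre-maintenance script. A sketch under stated assumptions: the array name md0 is made up, and it assumes a kernel new enough (roughly 2.6.16+) to expose sync_action and mismatch_cnt in sysfs -- check your own kernel before relying on mismatch_cnt.]

```shell
#!/bin/sh
# start a read-only consistency check and wait for it to finish
MD=md0
echo check > /sys/block/$MD/md/sync_action
while [ "$(cat /sys/block/$MD/md/sync_action)" != "idle" ]; do
    sleep 60
done
# a non-zero count here means inconsistent stripes were found;
# investigate before doing any add/grow on this array
cat /sys/block/$MD/md/mismatch_cnt
```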
Re: Resize on dirty array?
In this case the raid WAS the backup... however it seems it turned out
to be less reliable than the single disks it was supporting.  In the
future I think I'll make sure my disks have varying ages so they don't
fail all at once.

James

> > RAID is no excuse for backups.
>
> PS:
Re: Resize on dirty array?
> failure).  In the future, is there a way to help prevent this?

sure; periodic scans (perhaps smartctl) of your disks would help prevent
it.  I suspect that throttling the rebuild rate is also often a good idea
if there's any question about disk reliability.

> RAID is no excuse for backups.

I wish people would quit saying this: not only is it not helpful, but
it's also wrong.

a traditional backup is nothing more than a strangely async raid1, with
the same space inefficiency.  tape is not the answer, and is becoming
less so.  the idea of a periodic snapshot to media which is located apart
and not under the same load as the primary copy is a good one, but not
cheap or easy.  backups are also often file-based, which is handy but
orthogonal to being raid (or incremental, for that matter).  and backups
don't mean you can avoid the cold calculation of how much reliability you
want to buy.  _that_ is how you should choose your storage
architecture...

regards, mark hahn.
Re: Resize on dirty array?
2006/8/9, James Peverill <[EMAIL PROTECTED]>:
> failure).  In the future, is there a way to help prevent this?

RAID is no excuse for backups.  smartd may warn you in advance.

Best
Martin

PS: http://en.wikipedia.org/wiki/Top-posting#Top-posting
Re: Resize on dirty array?
I'll try the force assemble but it sounds like I'm screwed.  It sounds
like what happened was that two of my drives developed bad sectors in
different places that weren't found until I accessed certain areas (in
the case of the first failure) and did the drive rebuild (for the second
failure).  In the future, is there a way to help prevent this?

Given that the bad sectors were likely on different parts of their
respective drives, I should still have a complete copy of all the data,
right?  Is it possible to recover from a partial two-disk failure using
all the disks?

It looks like I might as well cut my losses and buy new disks (I suspect
the last two drives are near death given what's happened to their
brethren).  If I go SATA am I better off getting 2 dual port cards or 1
four port?

Thanks again.

James

Neil Brown wrote:
> On Tuesday August 8, [EMAIL PROTECTED] wrote:
> > The resize went fine, but after re-adding the drive back into the
> > array I got another fail event (on another drive) about 23% through
> > the rebuild :(
> >
> > Did I have to "remove" the bad drive before re-adding it with mdadm?
> > I think my array might be toast...
>
> You wouldn't be able to re-add the drive without removing it first.
> But why did you re-add the failed drive?  Why not add the new one?
> Or maybe you did...
>
> 2 drives failed - yes - that sounds a bit like toast.
>
> You can possibly do a --force assemble without the new drive and try to
> back up the data somewhere - if you have somewhere large enough.
>
> NeilBrown
>
> > Any tips on where I should go now?
> >
> > Thanks for the help.
> >
> > James
Re: Resize on dirty array?
On Tuesday August 8, [EMAIL PROTECTED] wrote:
> The resize went fine, but after re-adding the drive back into the array
> I got another fail event (on another drive) about 23% through the
> rebuild :(
>
> Did I have to "remove" the bad drive before re-adding it with mdadm?  I
> think my array might be toast...

You wouldn't be able to re-add the drive without removing it first.
But why did you re-add the failed drive?  Why not add the new one?
Or maybe you did...

2 drives failed - yes - that sounds a bit like toast.

You can possibly do a --force assemble without the new drive and try to
back up the data somewhere - if you have somewhere large enough.

NeilBrown

> Any tips on where I should go now?
>
> Thanks for the help.
>
> James
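[Editor's note: the forced assembly Neil suggests might look like the following sketch. Every device name here is invented for illustration, and which member to leave out is a judgment call -- typically the drive that failed first, since its data is the most stale.]

```shell
# stop the degraded/half-assembled array first
mdadm --stop /dev/md0

# force-assemble from the three most up-to-date members of the 4-drive raid5
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1

# mount read-only and copy everything off before touching the array further
mount -o ro /dev/md0 /mnt/rescue
rsync -a /mnt/rescue/ /backup/destination/
```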
Re: Resize on dirty array?
The resize went fine, but after re-adding the drive back into the array
I got another fail event (on another drive) about 23% through the
rebuild :(

Did I have to "remove" the bad drive before re-adding it with mdadm?  I
think my array might be toast...

Any tips on where I should go now?

Thanks for the help.

James

Neil Brown wrote:
> On Monday August 7, [EMAIL PROTECTED] wrote:
> > I have a software raid 5 setup with four drives.  One drive failed.
> > I got a replacement but unfortunately it turns out that my original
> > disks were just a few gigs over the replacement.  It seems that most
> > manufacturers don't actually advertise the REAL capacity of the disk,
> > so getting one that is the same size as the old ones could be tough
> > (and they aren't available anymore of course...)
> >
> > So my question... can I resize the array while it is missing a drive?
> > The raid is <50% full, and the few gigs is only a few percent.  In
> > retrospect I shouldn't have sized them right to the limit...
>
> Yes, that should work.
>
> First resize the filesystem to make it smaller.  Then resize the array:
>   mdadm --grow /dev/mdX --size=whatever
>
> You have to calculate 'whatever' yourself.  It is in kibibytes and must
> be at least 128K less than the size of the new drive, and obviously
> must leave room for the filesystem.
>
> A good suggestion is:
>   shrink the filesystem a lot.
>   shrink the array an adequate amount
>   add the new drive
>   resize the array up to 'max'  (mdadm -G /dev/mdX --size=max)
>   resize the filesystem up to max.
>
> NeilBrown
Re: Resize on dirty array?
On Monday August 7, [EMAIL PROTECTED] wrote:
> I have a software raid 5 setup with four drives.  One drive failed.  I
> got a replacement but unfortunately it turns out that my original disks
> were just a few gigs over the replacement.  It seems that most
> manufacturers don't actually advertise the REAL capacity of the disk,
> so getting one that is the same size as the old ones could be tough
> (and they aren't available anymore of course...)
>
> So my question... can I resize the array while it is missing a drive?
> The raid is <50% full, and the few gigs is only a few percent.  In
> retrospect I shouldn't have sized them right to the limit...

Yes, that should work.

First resize the filesystem to make it smaller.  Then resize the array:

  mdadm --grow /dev/mdX --size=whatever

You have to calculate 'whatever' yourself.  It is in kibibytes and must
be at least 128K less than the size of the new drive, and obviously must
leave room for the filesystem.

A good suggestion is:
  shrink the filesystem a lot.
  shrink the array an adequate amount
  add the new drive
  resize the array up to 'max'  (mdadm -G /dev/mdX --size=max)
  resize the filesystem up to max.

NeilBrown
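[Editor's note: Neil's shrink-then-grow recipe, written out with the KiB arithmetic he leaves to the reader. The drive size, filesystem type, and device names below are invented for illustration; only the per-device --size unit (KiB) and the command sequence come from his message.]

```shell
# Hypothetical new drive: 398,000,000,000 bytes.  mdadm --size is the
# amount of each component device to use, in KiB (1024 bytes).
NEW_BYTES=398000000000
MARGIN_KIB=$((1024 * 1024))                 # ~1 GiB slack for superblock + safety
SIZE_KIB=$((NEW_BYTES / 1024 - MARGIN_KIB))
echo "$SIZE_KIB"                            # 387623299

# the sequence itself (ext3 shown as an example filesystem):
# resize2fs /dev/md0 350G                   # shrink the fs well below the target
# mdadm --grow /dev/md0 --size=$SIZE_KIB    # shrink the per-device size
# mdadm --add /dev/md0 /dev/sde1            # add the new, slightly smaller drive
# mdadm --grow /dev/md0 --size=max          # grow back to the new maximum
# resize2fs /dev/md0                        # grow the fs to fill the array
```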
Resize on dirty array?
I have a software raid 5 setup with four drives.  One drive failed.  I
got a replacement but unfortunately it turns out that my original disks
were just a few gigs over the replacement.  It seems that most
manufacturers don't actually advertise the REAL capacity of the disk, so
getting one that is the same size as the old ones could be tough (and
they aren't available anymore of course...)

So my question... can I resize the array while it is missing a drive?
The raid is <50% full, and the few gigs is only a few percent.  In
retrospect I shouldn't have sized them right to the limit...

Any ideas would be appreciated.  Thanks!

James