Hi Ian,

"Copy from my backup? But, it's RAID! I don't need a backup!"

LOLs :-)

RAID set up properly with a spare can provide hot failover (when a disk
crashes within the active array, it is marked as failed and the spare is
made active). The primary function of RAID is uptime. RAID is never a
substitute for backup unless you have more than one ;-)

I'm happy it worked out well for you, Ian.

All the best, Dan

On Mon, Jan 5, 2009 at 7:44 PM, Ian Bruseker <[email protected]> wrote:
> Dan,
>
> Copy from my backup? But, it's RAID! I don't need a backup!
>
> ;-)
>
> You appear to be right on the money, which sort of annoys me. :-)
> The whole point of RAID (particularly RAID 5) is it's supposed to
> bring a level of reliability to the system. If a disk fails, data isn't
> lost. Bringing the whole system to a hard locked crashing death is
> hardly "reliable", unless you count the fact that it did reliably lock
> up solid at exactly the same point of the resync each time. I suppose
> technically I didn't lose any data, because I was able to copy files
> off the array during that time from boot up until it hit the magical
> 40.1% (it restarted the sync from zero each time that happened), and
> could do it over and over after each restart as often as it took me to
> make sure everything was safely backed up (not that I cared - I never
> fully trusted the thing anyway, so I'd used it as my junk space,
> things I've downloaded, saved somewhere else, backed up to CD/DVD sort
> of space, so nothing of value would have been lost anyway even if I
> had no backup of it). There were only two files I couldn't copy off,
> which I guess must have been sitting on the bad place on the bad disk,
> because trying to copy them caused the system to lock up instantly
> during the copy.
>
> But back to you being right. First, I bought a new drive, completely
> removed all the partitions from all the drives and started from
> scratch.
> Of course I didn't get the bad drive on the first try, so I
> put the whole array together with one existing drive replaced by the
> new one, and watched it die promptly at 40.1% of the resync. But I
> got it on the second try and watched it happily rebuild all the way to
> 100%. So, clearly it's a drive. To put your idea of spare drives to
> the test, I rebuilt the array again, with 3 active and one spare
> drive, thus including the bad drive in the setup. And wouldn't you
> know, it rebuilt the array, and flagged the bad drive as faulty in the
> process rather than just falling over dead. How nice. Actually it
> flagged two as spare, one (the bad one) as "faulty spare", and left
> only one disk active in the RAID 5 array, which makes no sense at all,
> but at least it proves out that it could find the faulty drive given
> the chance. It even logged a ton of error messages to
> /var/log/messages rather than just locking up with no feedback.
>
> So, there's the lesson for the day, I guess. When running a RAID 5
> with software RAID, put a spare drive in the setup to catch such an
> event as a failed disk. I wouldn't have thought it was necessary, but
> in this case it seems it is.
>
> Thanks for the guidance, Dan. You are a guru. :-)
>
> Ian
>
> 2009/1/3 Dan Graham <[email protected]>
>>
>> Hi Ian,
>>
>> I have seen this happen when you create an mdadm RAID5 array without a
>> hot spare drive (4th disk). When a drive in the array fails with only
>> 3 disks it cannot rebuild itself without the hot spare. You may be
>> able to add an additional disk to the array and then try rebuilding it
>> but it will take far less time to create an entirely new array and
>> copy your backup data to it.
>>
>> All the best, Dan
>>
>
> _______________________________________________
> clug-talk mailing list
> [email protected]
> http://clug.ca/mailman/listinfo/clug-talk_clug.ca
> Mailing List Guidelines (http://clug.ca/ml_guidelines.php)
> **Please remove these lines when replying
>

--
One thing you can be sure of. If you throw a loaded gun in a monkey cage,
something bad is going to happen.
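
P.S. For anyone reading this thread later, the "3 active plus one spare" layout discussed above can be sketched roughly as follows. This is only an illustration, not the exact command either of us ran: the helper name `mdadm_create_args` and the device names (`/dev/md0`, `/dev/sdb1` and so on) are made up for the example. The script only builds and prints the command line; it does not run mdadm or touch any disks.

```python
import shlex


def mdadm_create_args(md_device, active, spares, level=5):
    """Build (but do not run) an `mdadm --create` command line.

    Hypothetical helper for illustration only. mdadm expects the device
    list to hold the active members first, followed by the spares.
    """
    if level == 5 and len(active) < 3:
        raise ValueError("RAID 5 needs at least 3 active devices")
    return [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(active)}",
        f"--spare-devices={len(spares)}",
        *active, *spares,
    ]


# Device names below are assumptions; substitute your own partitions.
args = mdadm_create_args(
    "/dev/md0",
    active=["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"],
    spares=["/dev/sde1"],
)
print(shlex.join(args))
```

Check the printed command against your own devices before running it as root, since `--create` overwrites whatever is on those partitions.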

