On 10/20/2015 04:43 PM, Michael Torrie wrote:
> On 10/20/2015 04:24 PM, Daniel Fussell wrote:
>> In the past, it was best to have a UPS.  That was long ago, and was
>> largely because (PATA/SATA) drives would lie about completing writes. 
>> In which case, you disabled the write cache.  Which was also something
>> that should have been done if you were using any kind of software raid. 
>>
>> Now that we have write barriers, it's a non-issue.  Also having a
>> battery-backed or flash-backed write cache on your raid controller was a
>> non-issue (assuming your raid controller is smart enough to turn off
>> each drive's write cache).
> What about a home server situation with just a couple of bare drives in
> a computer?  Is there still a potential problem for data loss here?
>
> UPS are quite cheap these days, so it's cheap insurance I guess.
>
>

Even with a UPS you can stop a machine mid-write.  I've got some Xen
domUs that went bat-snack crazy after an attempted distro and kernel
upgrade, and I had to destroy the machines while they were somewhere in
the middle of replaying the journal, or mounting, or something.  A couple
of machines have a weird empty directory in lost+found that I haven't
been able to delete (the system thinks it isn't empty), but other than
that, I've had no problems.

The worst problem I've ever had was an 8-disk RAID-5 with a punctured
RAID stripe due to a seriously flawed 6-month drive production run in
the Philippine manufacturing plant (somebody used the wrong drive
lubricant).  I ended up with two or three different punctured stripes
over about 5 months. 

The only reason XFS had any problems was that in the last punctured
stripe event, I tried to outsmart the controller and do a full-stripe
write at the failed location in question with dd, miscalculated the
block address, and blew out the wrong stripe.  Then I had two areas in
the filesystem that weren't happy!  Before that, the filesystem would
continue to run as long as you didn't do anything with the punctured
stripe.  I finally gave up on that raid array (as the manufacturer
wouldn't replace all the questionable Philippine drives), restored the
data from tape to a decent SAN array, and kicked the crappy drives to
the curb.  Yay for tape backup (even if it did take a week to restore it
all).
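
For anyone tempted to try the same dd trick, here is a rough sketch of the
arithmetic I should have double-checked.  It is not what the controller
actually does internally.  It assumes stripes are laid out contiguously from
offset 0 of the logical volume, with no partition or metadata offset, and
the names (Raid5Geometry, full_stripe_span, dd_args) and the 64 KiB chunk
size are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid5Geometry:
    """Hypothetical description of a RAID-5 layout (illustration only)."""
    disks: int        # total member disks in the array
    chunk_bytes: int  # per-disk stripe unit (chunk) size in bytes

    @property
    def stripe_data_bytes(self) -> int:
        # RAID-5 spends one chunk per stripe on parity, so the data
        # portion of a full stripe spans (disks - 1) chunks.
        return (self.disks - 1) * self.chunk_bytes


def full_stripe_span(geom: Raid5Geometry, logical_offset: int) -> tuple[int, int]:
    """Return (start_byte, length_bytes) of the stripe holding logical_offset.

    Assumes stripe 0 starts at byte 0 of the logical volume.
    """
    if logical_offset < 0:
        raise ValueError("logical_offset must be non-negative")
    stripe_index = logical_offset // geom.stripe_data_bytes
    return stripe_index * geom.stripe_data_bytes, geom.stripe_data_bytes


def dd_args(geom: Raid5Geometry, logical_offset: int, block_size: int = 512) -> dict:
    """Translate the stripe span into dd's bs=, seek= and count= operands."""
    start, length = full_stripe_span(geom, logical_offset)
    if start % block_size or length % block_size:
        raise ValueError("stripe is not aligned to the dd block size")
    return {"bs": block_size, "seek": start // block_size, "count": length // block_size}


if __name__ == "__main__":
    geom = Raid5Geometry(disks=8, chunk_bytes=64 * 1024)  # chunk size is assumed
    print(dd_args(geom, logical_offset=1_000_000))
```

Being off by one stripe in seek= is enough to clobber a perfectly healthy
stripe, which is more or less what I did.  Check the numbers against the
controller's own idea of the geometry before writing anything.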

I'm still a little torqued at having to eat a bunch of SAS drives, and I
will probably be gun-shy with every drive I ever get from now on, but
the filesystem performed admirably under the circumstances.

Grazie,
;-Daniel Fussell


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
