Sander Temme wrote:
Folks,

You may have noticed (or not) that Clarus has not been doing its Gump runs for a week or two. The issue was that both of the drives that make up the RAID-1 array Gump sits on suddenly went out of commission, without any notice or warning. This is not supposed to happen, and avoiding it is exactly the reason those drives are mirrored.

We call this "RAID minus one", in which you think your disks are mirrored, but they aren't. It is actually a worse state than RAID-0, "no redundancy at all", because at least there you know your data is vulnerable.


However, when I visited the colocation facility last week, I shut the box down, pulled and re-seated these drives and they are now once again available. The fact that they can up and disappear like this is kind of scary, but I'm glad they are not actually broken.

This is one of those things that are really hard to test.

I've seen SCSI controllers take down drives that were taking too long to respond; sometimes this is a transient event, and sometimes it is a precursor of trouble to come. It could also be the RAID controller itself that is failing; they have their own MTBF, see.

So, Gump runs are now back on Clarus, running at the same times as on vmgump except using gump/trunk.

Results, as always, are available at http://clarus.apache.org/

S.

[EMAIL PROTECTED]              http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4  B7B8 B2BE BC40 1529 24AF


