Sander Temme wrote:
Folks,
You may have noticed (or not) that Clarus has not been doing its Gump
runs for a week or two. The issue was that both of the drives that make
up the RAID-1 Gump sits on suddenly went out of commission, without any
notice or warning. This is not supposed to happen, and is exactly the
reason those drives are mirrored.
We call this "RAID minus one": you think your disks are mirrored, but
they aren't. It is actually a worse state than RAID-0 (no redundancy
at all), because at least there you know your data is vulnerable.
However, when I visited the colocation
facility last week, I shut the box down, pulled and re-seated these
drives and they are now once again available. The fact that they can up
and disappear like this is kind of scary, but I'm glad they are not
actually broken.
This is one of those things that are really hard to test.
I've seen SCSI controllers take down drives that were taking too long to
respond; sometimes this is a transient event, and sometimes it is a
precursor of trouble to come. It could also be the RAID controller that
is failing; they have their own MTBF, see.
So, Gump runs are now back on Clarus, running at the same times as on
vmgump except using gump/trunk.
Results as always available at http://clarus.apache.org/
S.
[EMAIL PROTECTED] http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4 B7B8 B2BE BC40 1529 24AF