Hi all
I have a simple 2 disk RAID 1 array which has become corrupted by a
faulty memory module.
If I repeatedly generate an MD5 hash on the same file, I consistently
get 1 of 2 values back, roughly alternating, so I assume that the 2
disks have different versions of the same file and they are accessed
more-or-less alternately. 'raidctl -s' tells me that all is well with
the array.
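
The repeated-hash check described above could be scripted roughly as follows. This is only a sketch: the function names and the run count are made up for illustration, and repeated reads may be served from the buffer cache rather than the disks, so a single digest does not prove the two disks agree.

```python
import hashlib
from collections import Counter


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_counts(path, runs=10):
    """Hash the same file `runs` times and count each distinct digest seen."""
    return Counter(md5_of(path) for _ in range(runs))
```

More than one key in the result from digest_counts() would match the "1 of 2 values" behaviour above.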
It appears that the likelihood of corruption is greater with larger
files: those over approx 1/2 gig are pretty much all corrupt while small files
are pretty much all ok. All this sounds reasonable under the
circumstances.
My idea on recovering as much as possible was to disconnect 1 drive,
copy all the data off, switch to the other drive and do the same, then
run an analysis on the 2 copies - if a file is the same on both copies,
then it's probably ok, if they differ, then one or both will be bad.
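
The comparison of the 2 copies could look something like the sketch below. It assumes each disk's contents have already been copied into its own directory tree; compare_copies and the report keys are hypothetical names, and it should be run on a machine with known-good memory, otherwise the hashes themselves can't be trusted.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_copies(copy_a, copy_b):
    """Sort files into same / differ / only_a / only_b by MD5 digest."""
    files_a = relative_files(copy_a)
    files_b = relative_files(copy_b)
    report = {
        "same": [],
        "differ": [],
        "only_a": sorted(files_a - files_b),
        "only_b": sorted(files_b - files_a),
    }
    for rel in sorted(files_a & files_b):
        if md5_of(os.path.join(copy_a, rel)) == md5_of(os.path.join(copy_b, rel)):
            report["same"].append(rel)   # identical on both: probably ok
        else:
            report["differ"].append(rel)  # one or both copies are bad
    return report
```

Files listed under "differ" would then be the ones to check against the originals, where those still exist.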
So, I did the first copy, but when I swapped to the other disk, RAIDFrame
remembered that this disk had 'failed', so it will not configure it into the
set (as I feared).
Does anyone know how I can tell RAIDFrame that the first drive is
actually ok, or is my reasoning just nonsense anyway?
What would a parity re-write do in this case?
Ironically this computer is in the process of being configured as backup
storage, so while I have the originals of most of the data, there is
some that I don't, and I haven't yet set up the secondary (off site)
backups. And yes I did test the backups were ok, the first ones at
least. It appears the module failed some time during the process. I
know, I should have been anal and checked every single one, but it was
all brand new hardware ...
Actually, that's when failure rates are high.
paulm