Very odd raid1 behavior/failure

Scott Patten Sat, 22 Jan 2000 10:16:23 -0800
I had a raid1 that consisted of a U2W drive and an IDE 
drive.  I removed the IDE drive and replaced it with 
another U2W drive.  About 36 hours later I started having 
problems.  I received lots of errors like the following:

Jan 20 15:18:39 solitude kernel: EXT2-fs error (device md(9,0)): ext2_readdir: bad entry in directory #705839: rec_len is too small for name_len - offset=36, inode=705871, rec_len=12, name_len=12
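
As far as I can tell, the message means the directory record is 
shorter than the name it claims to hold.  Here is a minimal 
sketch of that arithmetic, assuming the usual ext2 directory 
layout (an 8-byte header followed by the name, padded to a 
4-byte boundary).  The helper name min_rec_len is my own, not 
something taken from the kernel.

```python
# Illustrative helper, not kernel code; the name is hypothetical.
def min_rec_len(name_len: int) -> int:
    """Smallest valid ext2 directory record length for a name of name_len bytes."""
    header = 8  # inode (4 bytes) + rec_len (2) + name_len / file type (2)
    return (header + name_len + 3) & ~3  # round up to a multiple of 4


rec_len, name_len = 12, 12  # values taken from the log line above
needed = min_rec_len(name_len)
print(f"rec_len={rec_len}, minimum for name_len={name_len} is {needed}")
print("entry looks valid" if rec_len >= needed else "entry looks corrupt")
```

With the logged values the record would need at least 20 bytes, 
so a rec_len of 12 cannot hold a 12-byte name.  That points to 
damaged directory blocks rather than a single bad file.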


About 72 hours later I ran e2fsck multiple times and 
watched 50 or so files show up in lost+found.

I thought that maybe the drive that I recently added to the 
array was having problems so I unplugged it.  The remaining 
drive came up in degraded mode after a brief bout with 
e2fsck.  The data on this drive was almost 2 days old, 
however.  I then swapped the drives and booted with what I 
thought was the "bad" drive.  This drive contained many 
more errors but it also had current data.

This was baffling because it appeared as if, without 
warning (I checked /proc/mdstat), a drive was shut down - 
THE WRONG DRIVE.

Actually I can't tell for sure what the problem really is. 
I would like to somehow exercise the drives individually 
and together.  I had received a couple of SCSI timeouts so 
now I am suspicious of the drives and the controller and 
the cable and the termination, etc., even though all this 
equipment is almost new and had worked in another server 
until now (with timeouts which I thought were a SCSI+VIA 
issue).  I'm now running with 2 IDE drives (hopefully not 
for long) so I can test the SCSI system if possible.

Does anyone have any thoughts on this?  Any help is greatly 
appreciated.

Cheers,

Scott

<<<<<<<<<<<<<<<<<<------------------>>>>>>>>>>>>>>>>>>>>>
Scott Patten
Chisco