Greetings. We have lately had a lot of trouble with relatively large (on the order of 1 TB) file systems mounted on RAID 5 or RAID 6 volumes. The file systems in question are based on ext3.
In a typical scenario, we have a drive go bad in a RAID array. We then remove it from the array, if it hasn't been removed already, install a new hard drive (i.e., by hand, not from a hot spare), and add the new drive to the RAID array. The RAID operations are all done using mdadm.

After the RAID array has completed its rebuild, we run fsck on the RAID device. When we do that, fsck seems to run forever, i.e., for days at a time, occasionally spitting out messages about files with recognizable names, but never completing satisfactorily.

The systems in question are typically running SL 4.x. We've read that the version of fsck that is standard in SL 4 has some known bugs, especially with respect to large file systems. Hence, we've attempted to repeat the exercise with fsck.ext3 taken from the Fedora 8 distribution. This gives us improved, but still not satisfactory, results.

We usually end up just punting on the fsck: we make a new file system and restore from backups.

Maybe I'm just missing something obvious here. I'd like to know if you've had similar experiences and/or if you have a better way to do all this.

Thanks.

- Mike

--
Michael Hannon            mailto:[EMAIL PROTECTED]
Dept. of Physics          530.752.4966
University of California  530.752.4717 FAX
Davis, CA 95616-8677
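P.S. In case it helps to be concrete, here is a rough sketch of the sequence described above. The device names (/dev/md0, /dev/sdc1, /dev/sdd1) are placeholders for illustration, not our actual layout, and the helper names are made up for this note. The fsck step is shown as a forced, read-only check (-f -n), which is an assumption on my part about how one might want to look before repairing anything. By default the script only prints the commands.

```python
import subprocess


def rebuild_commands(array, failed, replacement):
    """Return the replace-and-check sequence as a list of argument lists.

    All three arguments are device paths supplied by the caller; the
    values used in the example below are placeholders.
    """
    return [
        # Mark the bad member as faulty, then take it out of the array.
        ["mdadm", array, "--fail", failed],
        ["mdadm", array, "--remove", failed],
        # Add the replacement drive; md then starts the rebuild.
        ["mdadm", array, "--add", replacement],
        # Block until the resync/recovery on the array has finished.
        ["mdadm", "--wait", array],
        # Forced check of the (unmounted) file system, opened read-only,
        # answering "no" to every repair question.
        ["fsck.ext3", "-f", "-n", array],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Stops at the first non-zero exit status and returns it, otherwise
    returns 0. Note that fsck uses non-zero codes to report problems
    found, so a non-zero result is not always a failure to run.
    """
    for cmd in commands:
        print(" ".join(cmd))
        if dry_run:
            continue
        status = subprocess.call(cmd)
        if status != 0:
            return status
    return 0


if __name__ == "__main__":
    # Placeholder device names; dry run only, nothing is executed.
    run(rebuild_commands("/dev/md0", "/dev/sdc1", "/dev/sdd1"))
```

Passing dry_run=False would actually execute the steps as root, so the printed list is worth reading first.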