This morning I came in and our frontend was hanging because an OST had
died overnight. I copied as much as I could by hand, and I would like to
find a way to track down this problem.
RIP <ffffffa030da9c>ksocklnd_lib_zc_capapble RSP 00001007b1efe5b
CR2: 00000000000028
It's probably not enough, but it's the first time this has happened.
The only thing we do overnight with Lustre is to pull down the NCBI
databases, format the nr database, then rsync the contents to a backup
directory.
On all OSTs and the one MGS/MDT we are running the Lustre-patched kernel,
2.6.9-42.0.3.EL_lustre.1.5.97smp with Rocks 4.2.1.
I appreciate any help with this.
--
Jeremy Mann
[EMAIL PROTECTED]
University of Texas Health Science Center
Bioinformatics Core Facility
(210) 567-2672