[Lustre-discuss] Failover possibly not working

Jeremy Mann Tue, 06 Nov 2007 09:26:21 -0800

I have set up 20 compute nodes as OSTs, each one failing over to the 
next, like compute-0-0 -> 0-1, 0-2 -> 0-3, and so on. However, this 
morning one of the drives in an OST failed. The node didn't reboot; it 
just remounted its Lustre OST device read-only. This caused our normal 
storage scripts to fail.


I had to reboot the node anyway to replace the drive, and that's when 
the failover to the next node happened. I can see on the metadata server 
that Lustre did indeed switch to the failover node. However, the files 
that were associated with the failed node are visible but not readable. 
Shouldn't the failover node have prevented this?

The drive that failed is completely dead. I can't even mount it to try a 
dd to restore the filesystem, so it looks like I'm going to have to 
rebuild the filesystem.


-- 
Jeremy Mann
[EMAIL PROTECTED]
University of Texas Health Science Center
Bioinformatics Core Facility
(210) 567-2672

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
