Brian,

Do you have Corosync or another Linux HA software stack running on these systems? You need an HA software layer to handle heartbeat monitoring, split-brain protection, and the mounting and migration of resources.
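To make the role of that HA layer concrete, here is a minimal toy sketch in Python of the decision it makes: track heartbeats, and mount each target on exactly one live node. This is not Corosync, Pacemaker, or Lustre code. The names `Target`, `alive`, and `plan_mounts`, the 10-second timeout, and the heartbeat timestamps are all assumptions made up for illustration, and a real stack would also fence the failed node before remounting anything.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Target:
    """Hypothetical description of one storage target (e.g. an OST)."""
    name: str                          # e.g. "ost0"
    primary: str                       # node that normally mounts the target
    failover: str                      # node allowed to take it over
    mounted_on: Optional[str] = None   # where it is mounted right now


def alive(last_heartbeat: Dict[str, float], node: str,
          now: float, timeout: float) -> bool:
    """A node counts as alive if its last heartbeat is recent enough."""
    return node in last_heartbeat and now - last_heartbeat[node] <= timeout


def plan_mounts(targets: List[Target], last_heartbeat: Dict[str, float],
                now: float, timeout: float = 10.0) -> Dict[str, Optional[str]]:
    """Pick exactly one node (or None) for each target.

    A target stays where it is while that node is alive. Otherwise it goes
    to the primary if alive, else to the failover node, else nowhere.
    """
    plan: Dict[str, Optional[str]] = {}
    for t in targets:
        if t.mounted_on is not None and alive(last_heartbeat, t.mounted_on,
                                              now, timeout):
            plan[t.name] = t.mounted_on
            continue
        candidates = [t.primary, t.failover]
        plan[t.name] = next(
            (n for n in candidates if alive(last_heartbeat, n, now, timeout)),
            None,
        )
    return plan


# Illustrative layout taken from the thread: two nodes, each the failover
# partner for the other's OST. The heartbeat times are invented.
targets = [
    Target("ost0", primary="node00", failover="node01", mounted_on="node00"),
    Target("ost1", primary="node01", failover="node00", mounted_on="node01"),
]
heartbeats = {"node00": 100.0, "node01": 70.0}  # node01 has gone quiet
print(plan_mounts(targets, heartbeats, now=100.0))
# prints {'ost0': 'node00', 'ost1': 'node00'}
```

The point of the sketch is only that something outside the filesystem has to decide which single node holds each target at any moment; the same device should never be mounted on both nodes at once.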
--Jeff

On 10/11/12 2:02 PM, Andrus, Brian Contractor wrote:
> All,
>
> I am starting to try and configure failover for our lustre filesystem.
> Node00 is the mgs/mdt
> Node00 is the oss for ost0 and failnode for ost1
> Node01 is the oss for ost1 and failnode for ost0
>
> Both osts are on an SRP network and are visible by both nodes.
> Ost0 is mounted on node00
> Ost1 is mounted on node01
>
> If I try to mount ost0 on node01 I see in the logs for node00:
> kernel: Lustre: Denying initial registration attempt from nid
> 10.100.255.250@o2ib, specified as failover
>
> So do I have to manually mount the ost for failover purposes when there is a
> fail?
> I would have thought I mount the osts on both nodes and lustre will manage
> which node is the active node.
>
>
> Brian Andrus
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

--
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

/* New Address */
4170 Morena Boulevard, Suite D - San Diego, CA 92117