Brian,

Do you have Corosync or other Linux HA software running on these
systems? Lustre does not manage failover by itself. You need an HA
software layer to handle heartbeat monitoring, split-brain protection,
and mounting/migrating of resources.
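
As a rough illustration of the decision such an HA layer makes before it
mounts a peer's OST, here is a minimal Python sketch. It is hypothetical
and is not how Corosync or any other HA package is implemented: the names
PeerState and should_take_over and the 30-second timeout are assumptions
made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """What this node knows about its failover partner (hypothetical)."""
    last_heartbeat: float  # time the last heartbeat arrived, in seconds
    fenced: bool           # True once the peer is confirmed powered off


def should_take_over(peer: PeerState, now: float, timeout: float = 30.0) -> bool:
    """Return True only when it is safe to mount the peer's OST locally."""
    heartbeat_lost = (now - peer.last_heartbeat) > timeout
    # Without fencing, both nodes could mount and write the same OST
    # (split brain), so a lost heartbeat alone is not enough.
    return heartbeat_lost and peer.fenced
```

The point is that the takeover is gated on both a missed heartbeat and a
confirmed fence of the other node, which is why mounting the same OST on
both nodes at once is refused.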

--Jeff

On 10/11/12 2:02 PM, Andrus, Brian Contractor wrote:
> All,
>
> I am starting to configure failover for our Lustre filesystem.
> Node00 is the mgs/mdt
> Node00 is the oss for ost0 and failnode for ost1
> Node01 is the oss for ost1 and failnode for ost0
>
> Both osts are on an SRP network and are visible by both nodes.
> Ost0 is mounted on node00
> Ost1 is mounted on node01
>
> If I try to mount ost0 on node01 I see in the logs for node00:
>       kernel: Lustre: Denying initial registration attempt from nid 10.100.255.250@o2ib, specified as failover
>
> So do I have to manually mount the ost for failover purposes when there is a
> failure?
> I would have thought I could mount the osts on both nodes and Lustre would
> manage which node is the active node.
>
>
> Brian Andrus
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss


-- 
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

/* New Address */
4170 Morena Boulevard, Suite D - San Diego, CA 92117
