David:

On 3/14/07, David Vasil <[EMAIL PROTECTED]> wrote:

> What are people doing for failover (at the lustre layer) under 1.4.X
> series lustre?  Specifically the failing of OSTs between a failed host
> and its failover pair.


In principle, failover is easily controlled via lconf and the grouping of
nodes. The issue, as you describe it further on:

> Under 1.4.9 I have found that the --group feature to lconf does not
> appear to work.  Likewise I have had issues with "lconf --cleanup
> --force --service <ost> <config file>" trying to unload all of lustre
> modules on a running OSS (which leaves the OSS in somewhat of a bad
> state).


is exactly what I am facing as well. Recently, on a 1.4.9 cluster, when I
tried to fail an OST back to its primary OSS, the secondary OSS that had
taken over its services refused to give them up. Unfortunately my time on
that cluster was limited, so I am now setting up 1.4.9 on a few new systems
to carry on testing.

I hope to update you (and the list) within two days. If anyone else can pipe
up about what David and I may be doing wrong with the lconf commands listed
above, it would help greatly.

Regards,
--
Mustafa A. Hashmi
[EMAIL PROTECTED]
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
