On 10/29/07, Kazuki Ohara <[EMAIL PROTECTED]> wrote:
> Brian J. Murrell wrote:
> > On Fri, 2007-10-26 at 16:03 +0900, Kazuki Ohara wrote:
> >> Hi Brian,
> >> Thank you for your answer.
> >
> > NP.
> >
> >> By the way, I doubt the need of the --failnode directive.
> >
> > It's needed. That is how the MGS and thusly all other nodes learn of an
> > OSS's failover partner. This information is communicated via the
> > mkfs.lustre command to the MGS.
>
> uh...
> Thank you for your answer, but I can't work out why the MGS and OSS need
> to learn of the failover partner. With that information, does the MGS or
> OSS ask the partner not to access the shared volume, or make some other
> special request?
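For context, the step Brian describes happens when the OST is formatted: the
failover partner's NID is passed to mkfs.lustre and recorded by the MGS. A
rough sketch, with placeholder NIDs, device, and filesystem name (none of
these are taken from the thread):

    # Format an OST on the shared device, naming oss2's NID as the
    # failover partner, then start serving it from oss1 (the primary).
    mkfs.lustre --fsname=testfs --ost --mgsnode=10.0.0.1@tcp0 \
        --failnode=10.0.0.3@tcp0 /dev/sdb
    mount -t lustre /dev/sdb /mnt/ost0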
I am sure someone who understands Lustre internals can tackle this question
better; however, from my understanding: the MGS keeps track of all data as it
is written to the OST in question, as well as of the OSS responsible for that
OST. By creating an OSS pair, you effectively delegate responsibility for the
back-end storage to both machines and get a seamless failover, with client
requests redirected transparently.

The second part of your question is "how do you tell the standby OSS not to
access the volume?". The MGS will not direct client requests for the OST in
question to the standby node, so the shared or replicated OST need not be
mounted, or even actively available, on the standby. When a failover is
required, the shared/replicated storage device is mounted on the standby OSS
and the Lustre failover is handled via the MGS.
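To make that last step concrete, here is a rough sketch of what the takeover
might look like on the standby node. The device path and mount point are
placeholders, and the device must be the same shared or replicated volume the
primary OSS was serving:

    # oss1 (the primary) has failed or been taken out of service.
    # On oss2 (the standby), bring the shared OST online:
    mount -t lustre /dev/sdb /mnt/ost0

    # Clients learned oss2's NID as the failover NID from the MGS
    # configuration (the --failnode given at format time), so they
    # reconnect to oss2 and resume I/O once recovery completes.

As I understand it, the standby is never explicitly told to stay away from
the volume; it simply does not mount the OST until it is asked to take over.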
> Excuse my persistent question.

Hope the explanation helped -- I am sure CFS/Sun can clarify further.

-mustafa.
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss